Abstract. The recent theory of graph limits gives a powerful framework
for understanding the properties of suitable (convergent) sequences
$(G_\nu)$ of graphs in terms of a limiting object which may be represented
by a symmetric function $W$ on $[0,1]$, i.e., a kernel or graphon. In this
context it is natural to wish to relate specific properties of the sequence
to specific properties of the kernel. Here we show that the kernel is
monotone (i.e., increasing in both variables) if and only if the sequence
satisfies a 'quasi-monotonicity' property defined by a certain functional
tending to zero. As a tool we prove an inequality relating the cut and
$L^1$ norms of kernels of the form $W_1 - W_2$ with $W_1$ and $W_2$ monotone
that may be of interest in its own right; no such inequality holds for
general kernels.



1. Introduction 

Recently, Lovász and Szegedy [20] and Borgs, Chayes, Lovász, Sós and
Vesztergombi (see, e.g., [1]) developed a rich theory of graph limits,
associating limit objects to suitable sequences $(G_\nu)$ of (dense) graphs with
$|G_\nu| \to \infty$, where $|G_\nu|$ denotes the number of vertices of $G_\nu$. The basics of
this theory are outlined in Section 2 below; see also Diaconis and Janson [8].
These graph limits (which are not themselves graphs) can be represented
in several different ways; perhaps the most important is that every graph
limit can be represented by a kernel (or graphon) on $[0,1]$, i.e., a symmetric
measurable function $W : [0,1]^2 \to [0,1]$. However, this representation is
in general not unique, see e.g. [13, 0, H, y]. More generally, kernels can be
defined on any probability space, see Section 2.

We use $\Gamma$ to denote an arbitrary graph limit, and write $\Gamma_W$ for the graph
limit defined by a kernel $W$. We say that two kernels $W$ and $W'$ are equivalent
if they define the same graph limit, i.e., if $\Gamma_W = \Gamma_{W'}$. We write $G_\nu \to \Gamma$
when the sequence $(G_\nu)$ converges to $\Gamma$ (see [20], [B[ and Section 2 below for
definitions); if $\Gamma$ is represented by a kernel $W$, i.e., if $\Gamma = \Gamma_W$, we also write
$G_\nu \to W$.

Following [8], we denote the set of all graph limits by $\mathcal{U}_\infty$, and note that
$\mathcal{U}_\infty$ is a compact metric space. Another version of the important compactness
property for graph limits is that every sequence $(G_\nu)$ of graphs with
$|G_\nu| \to \infty$ has a convergent subsequence, i.e., a subsequence converging to
some $\Gamma \in \mathcal{U}_\infty$.

Given a suitable class T of graphs, it seems interesting to study the graph
limits of T, i.e., the set of graph limits arising as limits of sequences of
graphs in T. One interesting example is the class of threshold graphs, which
has several different characterizations, see e.g. [23]. One of them is the
monotonicity property of the neighbourhoods $N(v)$ of the vertices:

There exists a (linear) ordering $\prec$ of the vertices such that

    if $v \prec w$, then $N(v) \setminus \{v,w\} \subseteq N(w) \setminus \{v,w\}$.  (1.1)

The graph limits of threshold graphs were studied by Diaconis, Holmes
and Janson [3] (see also [21]), who showed that they are exactly the graph
limits that can be represented by kernels $W$ that take values in $\{0,1\}$ only
and are increasing, in that

    $W(x_1,y_1) \le W(x_2,y_2)$ if $0 \le x_1 \le x_2 \le 1$, $0 \le y_1 \le y_2 \le 1$.  (1.2)

In other words, $W$ is the indicator function of a symmetric increasing subset
of $[0,1]^2$. (In this paper, 'increasing' should always be interpreted in the
weak sense, i.e., as 'non-decreasing'.) Moreover, the representation by such
a $W$ is unique, if, as is usual, we identify functions that are equal a.e.

Note that the monotonicity properties in (1.1) and (1.2) are obviously
related; this is perhaps best seen if (1.1) is rewritten as a monotonicity
property of the adjacency matrix of the graph (with some exceptions at the
diagonal), so even without the detailed technical study in 0], the condition
(1.2) should not be surprising.

Increasing and decreasing kernels define the same set of graph limits, by 
the change of variables $x \mapsto 1 - x$. Hence we shall talk about monotone
kernels rather than increasing kernels, but for simplicity (and without loss 
of generality) we consider only increasing ones, so in this paper 'monotone' 
is regarded as synonymous with 'increasing'. 

The main purpose of the present paper is to study the larger class of graph 
limits represented by arbitrary monotone kernels (taking any values in [0,1], 
rather than just the values 0 and 1), and the corresponding sequences of
graphs. We shall also study analytic properties of monotone kernels them- 
selves. 

Definition. Let $\mathcal{W}_\uparrow$ be the set of monotone kernels on $[0,1]$, i.e., the set of
all symmetric measurable functions $W : [0,1]^2 \to [0,1]$ that satisfy (1.2).

Let $\mathcal{U}_\uparrow$ be the corresponding class of graph limits, i.e., the class of graph
limits that can be represented as $\Gamma_W$ for some $W \in \mathcal{W}_\uparrow$. We call these graph
limits monotone.

By definition, every monotone graph limit can be represented by a mono- 
tone kernel W on [0, 1], but note that a monotone graph limit may also have 
many representations by non-monotone kernels. For example, a monotone
kernel can be rearranged by an arbitrary measure-preserving bijection from 
[0, 1] to itself, which will in general destroy monotonicity. 

The classes $\mathcal{W}_\uparrow$ of monotone kernels and $\mathcal{U}_\uparrow$ of monotone graph limits
are studied in Section 4. We show there that $\mathcal{W}_\uparrow$ is a compact subset of
$L^1([0,1]^2)$, and that $\mathcal{U}_\uparrow$ is a compact subset of $\mathcal{U}_\infty$. In addition, we consider
monotone kernels defined on other (ordered) probability spaces, showing 
that each such kernel is equivalent to a monotone kernel on [0,1], so the 
class is not enlarged by allowing arbitrary probability spaces. 

Definition. A sequence $(G_\nu)$ of graphs with $|G_\nu| \to \infty$ is quasimonotone
if it converges to the set $\mathcal{U}_\uparrow$, in the sense that each convergent subsequence
has as its limit a graph limit in $\mathcal{U}_\uparrow$. In this case we will also say that $(G_\nu)$
is a sequence of quasimonotone graphs.

In particular, a sequence $(G_\nu)$ converging to a graph limit in $\mathcal{U}_\uparrow$ is
quasimonotone. Note that it makes no formal sense to ask whether an individual
graph is quasimonotone; just as for quasirandomness, quasimonotonicity is 
a property of sequences of graphs. 

Example 1.1 (Threshold graphs are quasimonotone). As noted above, each 
convergent sequence of threshold graphs converges to a limit represented by 
a 0/1-valued kernel $W \in \mathcal{W}_\uparrow$. Hence every sequence of threshold graphs
(with orders tending to $\infty$) is quasimonotone.

Example 1.2 (Quasirandom graphs are quasimonotone). Quasirandom graphs
were introduced by Thomason (23,[1E] as sequences $(G_\nu)$ of graphs that have
certain properties typical of random graphs. A number of different such
properties turn out to be equivalent, and there are thus many equivalent
characterizations, see Chung, Graham and Wilson [6]. Another characterization,
found by Lovász and Szegedy [20], is that a sequence $(G_\nu)$ is quasirandom
if and only if it converges to a graph limit represented by a constant
kernel $W(x,y) = p$, for some $p \in [0,1]$. (See also [l^] and [3].) Since a
constant function is monotone, $W \in \mathcal{W}_\uparrow$, and thus every quasirandom sequence
of graphs is quasimonotone.

Example 1.3 (Random graphs are quasimonotone). The sequence of random
graphs $G(\nu,p)$ with some fixed $p \in [0,1]$ and $\nu = 1,2,\dots$ (coupled
in the natural way for different $\nu$) is a.s. quasirandom, and thus a.s.
quasimonotone.

Our main result (Theorem 1.5 below) is that quasimonotone graphs can be
characterized by a weakening of (1.1). As is typical for conditions concerning
convergence to graph limits, this weakening involves taking averages over 
subsets of the vertex set V, rather than imposing a condition for all vertices, 
and allows for a small 'error', making the condition asymptotic. 

Given a graph G with vertex set V = V(G), a vertex v of G and a subset 
A of V, let 

    $e(v,A) := |N(v) \cap A| = |\{w \in A : w \sim v\}|$

denote the number of edges from $v$ to $A$.

Let $x_+$ denote the positive part of $x$, i.e., $\max\{x,0\}$. Writing $n := |G| = |V|$,
given a (linear) order $\prec$ on $V$ and a subset $A \subseteq V$, define

    $\Omega_0(G,\prec,A) := \frac{1}{n^3} \sum_{v \prec w} \bigl(e(v, A \setminus \{w\}) - e(w, A \setminus \{v\})\bigr)_+$  (1.3)
    $\qquad = \frac{1}{n^3} \sum_{v \prec w} \bigl(e(v, A \setminus \{v,w\}) - e(w, A \setminus \{v,w\})\bigr)_+$,  (1.4)
    $\Omega_0(G,\prec) := \max_{A \subseteq V} \Omega_0(G,\prec,A)$, and  (1.5)
    $\Omega_0(G) := \min_{\prec} \Omega_0(G,\prec)$.  (1.6)

In the last line the minimum is taken over all $n!$ orders on $V$. The normalization
by $n^3$ ensures that $0 \le \Omega_0 \le 1$. In fact, $\Omega_0 < 1/2$, and this bound
can be improved further, but this is not important for our purposes since
we are interested in small values of $\Omega_0$.

Note that $\Omega_0(G) = 0$ if and only if there exists an order $\prec$ such that
$\Omega_0(G,\prec,A) = 0$ for every $A$, i.e., $e(v, A \setminus \{v,w\}) \le e(w, A \setminus \{v,w\})$ for all $A$
and $v \prec w$, which is easily seen to be equivalent to (1.1), giving the following
result.

Proposition 1.4. A graph $G$ is a threshold graph if and only if $\Omega_0(G) = 0$. □
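
For very small graphs, the definitions (1.4)-(1.6) can be evaluated by brute force, which gives a quick sanity check of Proposition 1.4. The following is a minimal sketch, not part of the paper: the function names `omega0_term` and `omega0` and the two test graphs are illustrative choices, and the search over all $n!$ orders and $2^n$ subsets is only feasible for a handful of vertices.

```python
from fractions import Fraction
from itertools import combinations, permutations

def omega0_term(adj, order, A):
    """Omega_0(G, order, A) as in (1.4); adj maps each vertex to its neighbour set."""
    n = len(adj)
    total = 0
    for i, v in enumerate(order):
        for w in order[i + 1:]:            # all pairs with v before w in the order
            B = A - {v, w}
            total += max(len(adj[v] & B) - len(adj[w] & B), 0)
    return Fraction(total, n**3)

def omega0(adj):
    """Omega_0(G) as in (1.5)-(1.6): maximise over A, then minimise over all orders."""
    V = list(adj)
    subsets = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]
    return min(max(omega0_term(adj, order, A) for A in subsets)
               for order in permutations(V))

# Illustrative test graphs (assumptions of this sketch, not taken from the paper).
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}     # a threshold graph
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}     # the path P_4, not a threshold graph
print(omega0(star), omega0(path))
```

Exact rational arithmetic is used so that the comparison with zero is not affected by rounding.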

Note that $\Omega_0$ is not intended as a measure of how far a graph is from
being a threshold graph (for such a measure, see Section 8). Rather, we
may think (informally!) of a typical quasimonotone graph as being similar
to a random graph in which edges are independent, and the probability $p_{ij}$
of an edge $ij$ is increasing in $i$ and in $j$. In such a graph, one cannot expect
the neighbourhoods of different vertices to be even approximately nested.
But one can expect that for all 'large' sets $A$ of vertices, for most $i < j$,
$e(i,A)$ will be smaller than (or at least not much larger than) $e(j,A)$. The
idea is that a small value of $\Omega_0(G)$ detects this phenomenon, without relying
on any given labelling of the vertices.

Some variations of the functional $\Omega_0$ will be defined in Section 3, where
we shall show that they are asymptotically equivalent for our purposes.

Our main result is the following, proved in Section 7. (All unspecified
limits in this paper are taken as $\nu \to \infty$.)

Theorem 1.5. Let $(G_\nu)$ be a sequence of graphs with $|G_\nu| \to \infty$. Then
$(G_\nu)$ is quasimonotone if and only if $\Omega_0(G_\nu) \to 0$.

We state a special case separately.

Theorem 1.6. Let $(G_\nu)$ be a sequence of graphs with $|G_\nu| \to \infty$, and suppose
that $(G_\nu)$ is convergent, i.e., $G_\nu \to \Gamma$ for some graph limit $\Gamma \in \mathcal{U}_\infty$.
Then $\Gamma \in \mathcal{U}_\uparrow$ if and only if $\Omega_0(G_\nu) \to 0$.

We give several results on monotone graph limits in Sections 4-6. These
include a characterization in terms of a functional $\Omega(W)$ for kernels, analogous
to $\Omega_0$ for graphs. Along the way we prove some results about monotone
kernels that may be of interest in their own right. For example, on functions
that may be written as the difference between two monotone kernels, the
$L^1$ norm and the cut norm may be bounded in terms of each other; see
Theorem 5.5.

Remark 1.7. Lovász and Szegedy [12] have studied the class of graph limits
represented by 0/1-valued kernels (and the corresponding graph properties);
with a slight variation of their terminology we call such graph limits
random-free. In contrast to the monotone case, it can be shown that every
representing kernel of a random-free limit is a.e. 0/1-valued; see [HI]. It
follows that the graph limits that are both monotone and random-free are
exactly the threshold graph limits.

In Section 8 we consider the functional obtained by taking the supremum
over $A$ inside the sum in (1.3) instead of outside as in (1.5). We shall show
that this stronger functional characterizes convergence to threshold graph 
limits instead of monotone graph limits; we call the corresponding sequences 
of graphs quasithreshold. 

1.1. A problem. The convergence $G_\nu \to \Gamma$ of a sequence $(G_\nu)$ of graphs to
a graph limit $\Gamma$ can be expressed using the homomorphism numbers $t(F,\cdot)$:
$G_\nu \to \Gamma$ if and only if $t(F,G_\nu) \to t(F,\Gamma)$ for every fixed graph $F$; see e.g.
[20] for definitions and further results. In particular, the graph limit
$\Gamma$ is characterized by the family $(t(F,\Gamma))_F$. The families $(t(F,\Gamma))_F$ that
appear are characterized algebraically by Lovász and Szegedy [20].

Problem 1.8. Characterize the families $(t(F,\Gamma))_F$ that appear for $\Gamma \in \mathcal{U}_\uparrow$.

The rest of this paper is organized as follows. In the next section we review 
some basic properties of the cut metric that we shall rely on throughout the 
paper. In Section 3 we introduce some variants of the functional $\Omega_0$ for
graphs. In Section 4 we define analogous functionals for kernels and state
several key properties; these are proved in the next two sections, and then
our main results are deduced in Section 7. Finally, in Section 8 we discuss
related functionals characterizing quasithreshold graphs. 

2. Kernels and graph limits 

We state here some standard definitions and results that we shall use later
in the paper. For proofs and further details, see e.g. Borgs, Chayes, Lovász,
Sós and Vesztergombi 0], Bollobás and Riordan [i[, or Janson [lH, G3].

Let $(\mathcal{S},\mathcal{F},\mu)$ be a probability space; for simplicity, we will usually
abbreviate the notation to $\mathcal{S}$ or $(\mathcal{S},\mu)$.

A kernel (or graphon) on $\mathcal{S}$ is a symmetric measurable function $\mathcal{S}^2 \to [0,1]$.
We let $\mathcal{W}(\mathcal{S})$ denote the set of all kernels on $\mathcal{S}$.

If $W$ is an integrable function on $\mathcal{S}^2$, we define its cut norm by

    $\|W\|_\square := \sup_{\|f\|_\infty, \|g\|_\infty \le 1} \Bigl| \int_{\mathcal{S}^2} W(x,y) f(x) g(y) \, d\mu(x) \, d\mu(y) \Bigr|$,  (2.1)

where $\|\cdot\|_\infty$ denotes the norm in $L^\infty$. In other words, the supremum in
(2.1) is taken over all (real-valued) functions $f$ and $g$ with values in $[-1,1]$.
(Several other versions exist, which are equivalent within constants.) By
considering the supremum over $f$ with $g$ fixed, and vice versa, it is easy to
see that the supremum is unchanged if we restrict $f$ and $g$ to take values in
$\{\pm 1\}$, so we have

    $\|W\|_\square = \sup_{f,g : \mathcal{S} \to \{\pm 1\}} \Bigl| \int_{\mathcal{S}^2} W(x,y) f(x) g(y) \, d\mu(x) \, d\mu(y) \Bigr|$.  (2.2)

This norm defines a metric $\|W_1 - W_2\|_\square$ for kernels on the same probability
space $\mathcal{S}$; as usual, we identify kernels that are equal a.e.
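
On a finite probability space with uniform measure, (2.2) becomes a finite maximisation over sign vectors, which can be evaluated directly for small cases. The sketch below is an illustration only: it assumes the function is given as an $n \times n$ matrix with integer or `Fraction` entries on $\{0,\dots,n-1\}$, and the names `cut_norm` and `l1_norm` are hypothetical. For a fixed $f$ the optimal $g$ is the sign of the column sums, so only $f$ needs to be enumerated.

```python
from fractions import Fraction
from itertools import product

def cut_norm(M):
    """Cut norm (2.2) of the function (i, j) -> M[i][j] on {0..n-1}, uniform measure."""
    n = len(M)
    best = 0
    for f in product((1, -1), repeat=n):
        # For fixed f, the best g(j) in {+1, -1} is the sign of sum_i M[i][j] * f(i).
        cols = [sum(M[i][j] * f[i] for i in range(n)) for j in range(n)]
        best = max(best, sum(abs(c) for c in cols))
    return Fraction(best) / n**2

def l1_norm(M):
    """L^1 norm of the same function, for comparison."""
    n = len(M)
    return Fraction(sum(abs(x) for row in M for x in row)) / n**2

M = [[1, 1], [1, -1]]          # a symmetric test matrix (assumption of this sketch)
print(cut_norm(M), l1_norm(M))
```

The test matrix is chosen so that cancellation makes the cut norm strictly smaller than the $L^1$ norm.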

The cut norm may be used to define another (semi)metric $\delta_\square$, the cut
metric, as follows. If $\varphi : \mathcal{S}_1 \to \mathcal{S}_2$ is a measure-preserving map between two
probability spaces and $W$ is a kernel on $\mathcal{S}_2$, we let $W^\varphi$ be the kernel on $\mathcal{S}_1$
defined by $W^\varphi(x,y) := W(\varphi(x), \varphi(y))$. Let $W_1$ be a kernel on a probability
space $\mathcal{S}_1$ and $W_2$ a kernel on a possibly different probability space $\mathcal{S}_2$. Then

    $\delta_\square(W_1, W_2) := \inf_{\varphi_1, \varphi_2} \|W_1^{\varphi_1} - W_2^{\varphi_2}\|_\square$,  (2.3)

where the infimum is taken over all couplings $(\varphi_1, \varphi_2)$ of $\mathcal{S}_1$ and $\mathcal{S}_2$, i.e.,
over all pairs of measure-preserving maps $\varphi_1 : \mathcal{S}_3 \to \mathcal{S}_1$ and $\varphi_2 : \mathcal{S}_3 \to \mathcal{S}_2$
from a third probability space $\mathcal{S}_3$. It is not difficult to verify that $\delta_\square$ satisfies
the triangle inequality (see e.g. [HI), but note that $\delta_\square(W_1, W_2)$ may be 0
even if $W_1 \ne W_2$, for example if $W_1 = W_2^\varphi$ for some measure-preserving
$\varphi : \mathcal{S}_1 \to \mathcal{S}_2$. Hence, $\delta_\square$ is really a semimetric (but is usually called a metric
for simplicity).

Note that $\delta_\square(W_1, W_2)$ is defined for kernels on different spaces. Moreover,
it is invariant under measure-preserving maps: $\delta_\square(W_1^{\varphi_1}, W_2^{\varphi_2}) = \delta_\square(W_1, W_2)$
for any measure-preserving maps $\varphi_k : \mathcal{S}'_k \to \mathcal{S}_k$, $k = 1, 2$.

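Although we allow couplings $(\varphi_1, \varphi_2)$ defined on an arbitrary third space
$\mathcal{S}_3$, in (2.3) it suffices to consider the case when $\mathcal{S}_3 = \mathcal{S}_1 \times \mathcal{S}_2$, with a
measure $\mu$ having marginals $\mu_1$ and $\mu_2$, taking for $\varphi_1$ and $\varphi_2$ the projections
$\pi_k : \mathcal{S}_1 \times \mathcal{S}_2 \to \mathcal{S}_k$, $k = 1, 2$. In fact, for an arbitrary coupling $(\varphi_1, \varphi_2)$
defined on a space $(\mathcal{S}_3, \mu_3)$, the mapping $(\varphi_1, \varphi_2) : \mathcal{S}_3 \to \mathcal{S}_1 \times \mathcal{S}_2$ maps $\mu_3$ to
a measure $\mu$ on $\mathcal{S}_1 \times \mathcal{S}_2$ with the right marginals, and it is easily seen that
$\|W_1^{\varphi_1} - W_2^{\varphi_2}\|_\square = \|W_1^{\pi_1} - W_2^{\pi_2}\|_\square$.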

Although this will be of much lesser importance, we also define the
corresponding rearrangement-invariant version of the $L^1$ distance:

    $\delta_1(W_1, W_2) := \inf_{\varphi_1, \varphi_2} \|W_1^{\varphi_1} - W_2^{\varphi_2}\|_{L^1(\mathcal{S}_3^2)}$.  (2.4)

The coupling definition (2.3) of the cut metric is valid for all $\mathcal{S}_1$ and $\mathcal{S}_2$,
but in common special cases it is possible, and often convenient, to use other,
equivalent, definitions. For example, if $\mathcal{S}_1 = \mathcal{S}_2 = [0,1]$ (equipped with the
Lebesgue measure, as always), then as shown by Borgs, Chayes, Lovász, Sós
and Vesztergombi B Lemma 3.5],

    $\delta_\square(W_1, W_2) := \inf_{\varphi} \|W_1 - W_2^\varphi\|_\square$,  (2.5)

taking the infimum over all measure-preserving bijections $\varphi : [0,1] \to [0,1]$.

We say that two kernels $W_1$ and $W_2$ are equivalent if $\delta_\square(W_1, W_2) = 0$.
The set of equivalence classes is thus a metric space with the metric $\delta_\square$.
A central result (13, [B| is that these equivalence classes are in one-to-one
correspondence with the graph limits. In other words, each kernel $W$ defines
a graph limit $\Gamma_W$, every graph limit can be represented by a kernel in this
way, and two kernels define the same graph limit if and only if they are
equivalent. Thus, the cut metric defines the same notion of equivalence
as the one mentioned in the introduction. Furthermore, $W_1$ and $W_2$ are
equivalent if and only if $\delta_1(W_1, W_2) = 0$, see e.g.

Every kernel is equivalent to a kernel on $[0,1]$, so it suffices to consider
such kernels. (We shall not use this restriction in the present paper, however.)

One manifestation of the connection between graph limits and kernels is
the following: If $G$ is a graph with vertices labelled $1, 2, \dots, n$, let
$A_G(i,j) := \mathbf{1}\{i \sim j\}$ define its adjacency matrix, and let

    $W_G(x,y) := A_G(\lceil nx \rceil, \lceil ny \rceil)$.

This defines a kernel $W_G$ on $[0,1]$ (or rather on $(0,1]$, which is equivalent).
A sequence of graphs with $|G_\nu| \to \infty$ converges to the graph limit $\Gamma = \Gamma_W$
if and only if $\delta_\square(W_{G_\nu}, W) \to 0$.

Note that $W_G$ depends on the labelling of the vertices of $G$, but only in
a rather trivial way, and different labellings yield equivalent kernels. Here,
in the study of monotone kernels, the ordering is relevant. If $G$ is a graph
with a given order $\prec$ on $V$, we therefore define $W_G = W_{G,\prec}$ as above, but
using the labelling of the vertices with $1 \prec 2 \prec \cdots$, ignoring the original
labelling, if any.

3. Further measures of quasimonotonicity

In Section 1 we defined a functional $\Omega_0$ that measures, in an averaged
sense, how far the adjacency matrix of a graph is from being monotone.
There are several natural variations of the definition; we shall concentrate
on two.

Firstly, in (1.3) and (1.4), we were careful to exclude $v$ and $w$ from the
set $A$; this had the advantage of making $\Omega_0(G)$ exactly zero when $G$ is a
threshold graph. But most of the time it is more convenient not to do this.
Instead, we consider

    $\Omega_1(G,\prec,A) := \frac{1}{n^3} \sum_{v \prec w} \bigl(e(v,A) - e(w,A)\bigr)_+$,  (3.1)

which differs from (1.4) in that we count all edges into $A$, and not just the
edges into $A \setminus \{v,w\}$. This changes each edge count by at most 1, so

    $|\Omega_0(G,\prec,A) - \Omega_1(G,\prec,A)| \le 1/n$.  (3.2)

As in (1.5) and (1.6), we set

    $\Omega_1(G,\prec) := \max_{A \subseteq V} \Omega_1(G,\prec,A)$, and  (3.3)
    $\Omega_1(G) := \min_{\prec} \Omega_1(G,\prec)$.  (3.4)



Before turning to our second variant, let us note a basic property of $\Omega_0$.
Let $\bar e(v,A)$ denote the number of edges from $v$ to $A$ in the complement $G^c$
of $G$. If $v \notin A$, then $\bar e(v,A) = |A| - e(v,A)$. Hence, for any $v$, $w$ and $A$,

    $\bar e(w, A \setminus \{v,w\}) - \bar e(v, A \setminus \{v,w\}) = e(v, A \setminus \{v,w\}) - e(w, A \setminus \{v,w\})$.

From (1.4) it follows that $\Omega_0(G^c, \succ, A) = \Omega_0(G, \prec, A)$, where, naturally,
$\succ$ denotes the reverse of the order $\prec$. Thus $\Omega_0(G^c, \succ) = \Omega_0(G, \prec)$ and

    $\Omega_0(G^c) = \Omega_0(G)$.

For $\Omega_1$ one can show similarly, or deduce using (3.2), that
$|\Omega_1(G^c) - \Omega_1(G)| \le 2/n$, say.

Despite the above symmetry property of $\Omega_0$, the following 'locally
symmetrized' version of the definition turns out to have technical advantages.
Given a graph $G$, an order $\prec$ on $V(G)$, and $A \subseteq V(G)$, set

    $\Omega_2(G,\prec,A) := \Omega_1(G,\prec,A) + \Omega_1(G,\prec,V \setminus A)$,  (3.5)
    $\Omega_2(G,\prec) := \max_{A \subseteq V} \Omega_2(G,\prec,A)$  (3.6)

and

    $\Omega_2(G) := \min_{\prec} \Omega_2(G,\prec)$.  (3.7)

Of course, we could define a corresponding symmetrization of $\Omega_0$, but we
shall not bother.

It is easily seen that all our functionals $\Omega_j$ take values in $[0,1]$ (in fact, in
$[0,1/2)$). We have the following relations.

Lemma 3.1. If $G$ is a graph with $|G| = n$, then

    $|\Omega_0(G) - \Omega_1(G)| \le 1/n$,  (3.8)

and

    $\Omega_1(G) \le \Omega_2(G) \le 2\Omega_1(G)$.  (3.9)

Consequently, if $(G_\nu)$ is a sequence of graphs with $|G_\nu| \to \infty$, then $\Omega_j(G_\nu) \to 0$
for some $j$ if and only if this holds for all $j = 0, 1, 2$.

Proof. The inequality (3.8) is immediate from (3.2).
The definition (3.5) implies that

    $\Omega_1(G,\prec) \le \Omega_2(G,\prec) \le 2\Omega_1(G,\prec)$,  (3.10)

which in turn implies (3.9). □
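
The three functionals can be compared numerically on a tiny example. The following sketch is an illustration under stated assumptions, not the paper's method: the helper names `pair_sum` and `omega` and the 4-cycle test graph are hypothetical, and the exhaustive search over orders and subsets only works for very small $n$. It evaluates $\Omega_0$, $\Omega_1$ and $\Omega_2$ from (1.4), (3.1) and (3.5) and checks the inequalities (3.8) and (3.9).

```python
from fractions import Fraction
from itertools import combinations, permutations

def pair_sum(adj, order, A, j):
    """n^3 * Omega_j(G, order, A) for j = 0, 1, 2, following (1.4), (3.1), (3.5)."""
    V = set(adj)

    def f(v, w, B):                      # (e(v, B) - e(w, B))_+
        return max(len(adj[v] & B) - len(adj[w] & B), 0)

    total = 0
    for i, v in enumerate(order):
        for w in order[i + 1:]:          # pairs with v before w
            if j == 0:
                total += f(v, w, A - {v, w})
            elif j == 1:
                total += f(v, w, A)
            else:
                total += f(v, w, A) + f(v, w, V - A)
    return total

def omega(adj, j):
    """Omega_j(G): maximise over subsets A, then minimise over all orders."""
    V = list(adj)
    n = len(V)
    subsets = [set(c) for r in range(n + 1) for c in combinations(V, r)]
    best = min(max(pair_sum(adj, order, A, j) for A in subsets)
               for order in permutations(V))
    return Fraction(best, n**3)

cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}   # hypothetical test graph
o0, o1, o2 = (omega(cycle4, j) for j in range(3))
n = len(cycle4)
print(abs(o0 - o1) <= Fraction(1, n), o1 <= o2 <= 2 * o1)
```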

Remark 3.2. Instead of summing in (1.4) or (3.1), in analogy with the
standard definition of $\varepsilon$-regular partitions (see e.g. [2, Section IV.5]), we
may count the number of 'bad' pairs $(v,w)$ of vertices $v \prec w$ where the
difference $e(v,A) - e(w,A)$ is larger than $\varepsilon n$, for some small $\varepsilon$. This suggests
the following definition: with $\prec$ an order on the vertex set $V$, $n := |V|$, and
$A$ a subset of $V$, set

    $\Omega'_1(G,\prec,A) := \inf\bigl\{\varepsilon > 0 : |\{v \prec w : e(v,A) > e(w,A) + \varepsilon n\}| \le \varepsilon n^2\bigr\}$,

and define $\Omega'_1(G)$ by taking the maximum over $A$ with $\prec$ fixed, and then
minimizing over $\prec$. It is a standard observation that if $x_1, \dots, x_a$ take values
in $[0,b]$, then $\sum x_i \ge \varepsilon ab$ implies that there are at least $\varepsilon a/2$ of the $x_i$ that
are at least $\varepsilon b/2$, and that if at least $\varepsilon a$ of the $x_i$ are at least $\varepsilon b$, then the sum
is at least $\varepsilon^2 ab$. Using this it is easy to check that $\Omega_1$ and $\Omega'_1$ are bounded
by suitable functions of each other. In fact, it turns out that

    $\tfrac12 \Omega_1(G) \le \Omega'_1(G) \le \Omega_1(G)^{1/2}$.

We can also define corresponding modifications of the other $\Omega_j$.

Remark 3.3. Proposition 1.4 says that a graph $G$ is a threshold graph if
and only if $\Omega_0(G) = 0$. This does not hold for $\Omega_1$; in fact, if $G$ contains an
edge $vw$, with $v \prec w$, then $\Omega_1(G,\prec,\{w\}) \ge n^{-3} e(v,\{w\}) = n^{-3}$ by (3.1);
hence $\Omega_1(G) \ge n^{-3}$ unless $G$ is empty. Consequently, $\Omega_1(G) > 0$ for every
non-empty graph $G$. On the other hand, Proposition 1.4 and Lemma 3.1
show that $\Omega_1(G) \le 1/n$ for every threshold graph.

We defined each $\Omega_j(G)$ by taking the minimum of $\Omega_j(G,\prec)$ over all possible
orderings $\prec$ of the vertices. As the next lemma shows, for $\Omega_2$, ordering
the vertices by their degrees $d(v) := e(v,V)$ (resolving ties arbitrarily) is
optimal. This is the main reason for considering $\Omega_2$.

Lemma 3.4. Let $<$ be an order on $V$ such that $v < w \implies d(v) \le d(w)$.
Then $\Omega_2(G) = \Omega_2(G,<)$.

Proof. The inequality $\Omega_2(G) \le \Omega_2(G,<)$ is immediate from the definition
(3.7), so it suffices to prove the reverse inequality.

Let $\prec$ be any order on $V$. If $v < w$, then $e(v,V) = d(v) \le d(w) = e(w,V)$
and thus, for $A \subseteq V$,

    $e(v,A) - e(w,A) = e(v,V) - e(w,V) + e(w, V \setminus A) - e(v, V \setminus A)$
    $\qquad \le e(w, V \setminus A) - e(v, V \setminus A)$.  (3.11)

Let $f(v,w,A) := (e(v,A) - e(w,A))_+$ and $g(v,w,A) := f(v,w,A) + f(v,w,V \setminus A)$.
By (3.11), if $v < w$, then $f(v,w,A) \le f(w,v,V \setminus A)$ and, replacing $A$ by
$V \setminus A$, also $f(v,w,V \setminus A) \le f(w,v,A)$; thus

    $g(v,w,A) \le f(w,v,V \setminus A) + f(w,v,A) = g(w,v,A)$.  (3.12)
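
Using (3.12) for $v < w$ with $v \succ w$, we obtain

    $\Omega_2(G,<,A) = \frac{1}{n^3} \sum_{v < w} g(v,w,A)$
    $\qquad = \frac{1}{n^3} \sum_{v < w,\ v \prec w} g(v,w,A) + \frac{1}{n^3} \sum_{v < w,\ v \succ w} g(v,w,A)$
    $\qquad \le \frac{1}{n^3} \sum_{v < w,\ v \prec w} g(v,w,A) + \frac{1}{n^3} \sum_{v < w,\ v \succ w} g(w,v,A)$
    $\qquad = \frac{1}{n^3} \sum_{v \prec w} g(v,w,A) = \Omega_2(G,\prec,A)$.

Hence, by (3.6), $\Omega_2(G,<) \le \Omega_2(G,\prec)$. Since $\prec$ is arbitrary, this yields
$\Omega_2(G,<) = \Omega_2(G)$. □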
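
Lemma 3.4 can be checked by exhaustive search on a small graph. The sketch below is only an illustration: the function name `omega2_order`, the bitmask encoding of neighbourhoods, and the path on four vertices used as a test graph are all assumptions of this example. It compares $\Omega_2(G,<)$ for a degree order $<$ with the minimum of $\Omega_2(G,\prec)$ over all orders.

```python
from fractions import Fraction
from itertools import permutations

def omega2_order(nbr, order):
    """Omega_2(G, order) from (3.5)-(3.6); nbr[v] is the neighbour bitmask of v."""
    n = len(nbr)
    full = (1 << n) - 1

    def f(v, w, A):                      # (e(v, A) - e(w, A))_+ with A a bitmask
        return max(bin(nbr[v] & A).count("1") - bin(nbr[w] & A).count("1"), 0)

    best = max(sum(f(order[i], order[k], A) + f(order[i], order[k], full ^ A)
                   for i in range(n) for k in range(i + 1, n))
               for A in range(1 << n))
    return Fraction(best, n**3)

# Hypothetical test graph: the path 0-1-2-3 (degrees 1, 2, 2, 1).
nbr = [0b0010, 0b0101, 0b1010, 0b0100]
degree_order = sorted(range(len(nbr)), key=lambda v: bin(nbr[v]).count("1"))
by_degree = omega2_order(nbr, degree_order)
over_all = min(omega2_order(nbr, p) for p in permutations(range(len(nbr))))
print(by_degree == over_all)
```

Ties in the degree order are broken by Python's stable sort, which is one of the arbitrary tie-breaking rules the lemma allows.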

As an immediate consequence of Lemmas 3.4 and 3.1 we have the following
result for $\Omega_1$.

Corollary 3.5. Let $<$ be an order on $V$ such that $v < w \implies d(v) \le d(w)$.
Then $\Omega_1(G) \le \Omega_1(G,<) \le 2\Omega_1(G)$.

Proof. By (3.10), Lemma 3.4 and (3.9),

    $\Omega_1(G) \le \Omega_1(G,<) \le \Omega_2(G,<) = \Omega_2(G) \le 2\Omega_1(G)$.

(Alternatively, one can use a simplified version of the proof of Lemma 3.4.) □

Using a symmetrized version of $\Omega_0$, or otherwise, it is easy to prove the
corresponding result for $\Omega_0$.

Remark 3.6. If $G$ is regular, then any order $<$ satisfies the condition of
Lemma 3.4 and Corollary 3.5, so these results show that $\Omega_2(G,<)$ is the
same for all orders, and $\Omega_1(G,<)$ is the same for all orders within a factor
of 2; the latter holds also for $\Omega_0$.

The factor 2 in Corollary 3.5 is annoying but not really harmful for our
purposes. It is best possible, as shown by the following example.

Example 3.7. Consider a balanced complete bipartite graph $G = K_{m,m}$
(so $n = 2m$), with bipartition $(V_1, V_2)$. Given an order $\prec$ on the vertex set
$V_1 \cup V_2$, let $N_{ij} := |\{(x,y) \in V_i \times V_j : x \prec y\}|$. Note that

    $N_{12} + N_{21} = |V_1 \times V_2| = m^2$.  (3.13)

Let $A \subseteq V = V_1 \cup V_2$ and let $a_i = |A \cap V_i|$, $i = 1, 2$. Then $e(v,A) = a_2$ if
$v \in V_1$ and $e(v,A) = a_1$ if $v \in V_2$. Hence,

    $n^3 \Omega_1(G,\prec,A) = \sum_{v \prec w} \bigl(e(v,A) - e(w,A)\bigr)_+$
    $\qquad = N_{12}(a_2 - a_1)_+ + N_{21}(a_1 - a_2)_+$.  (3.14)

Since $a_1$ and $a_2$ can be freely chosen in $\{0, \dots, m\}$, we have
$a_1 - a_2 \in \{-m, \dots, m\}$, and maximizing over $A$ yields

    $n^3 \Omega_1(G,\prec) = m \max\{N_{12}, N_{21}\}$.  (3.15)

If $\prec_1$ is an order with all elements of $V_1$ coming first, then $N_{12} = m^2$ and
$N_{21} = 0$, and thus

    $\Omega_1(G,\prec_1) = m^3/n^3 = 1/8$.

On the other hand, if $m$ is even and $\prec_2$ is an order which starts with $m/2$
elements of $V_1$, continues with all of $V_2$, and finishes with the remaining half
of $V_1$, then $N_{12} = N_{21} = m^2/2$, and thus

    $\Omega_1(G,\prec_2) = \tfrac12 m^3/n^3 = 1/16$.  (3.16)

Thus $\Omega_1(G,\prec_1) = 2\Omega_1(G,\prec_2)$ although $G$ is regular and Corollary 3.5
applies to every order.

For $\Omega_0$, the ratio between $\Omega_0(G,\prec_1)$ and $\Omega_0(G,\prec_2)$ is $2 - O(1/n)$ by (3.2).

Note that for any order $\prec$, (3.13) implies $\max\{N_{12}, N_{21}\} \ge m^2/2$, and
thus (3.15) yields

    $\Omega_1(G) \ge n^{-3} m^3/2 = 1/16$.  (3.17)

Consequently, if $m$ is even, then (3.16) shows that

    $\Omega_1(G) = \Omega_1(G,\prec_2) = 1/16$  ($m$ even).  (3.18)

On the other hand, if $m$ is odd, then since $N_{12} + N_{21} = m^2$ is odd, for
any order $\prec$ we have $\max\{N_{12}, N_{21}\} \ge (m^2 + 1)/2$, and this is attained for
some $\prec$. Thus (3.15) now yields

    $\Omega_1(G) = n^{-3} m(m^2 + 1)/2 > 1/16$  ($m$ odd).  (3.19)

We thus have

    $\Omega_1(K_{m,m}) = 1/16$,  $m$ even,
    $\Omega_1(K_{m,m}) = (1 + m^{-2})/16 > 1/16$,  $m$ odd.  (3.20)

For $\Omega_2$, the situation is simpler. It follows from (3.14) that
$n^3 \Omega_1(G,\prec,V \setminus A) = N_{12}(a_2 - a_1)_- + N_{21}(a_1 - a_2)_-$, and thus, using (3.13),

    $n^3 \Omega_2(G,\prec,A) = N_{12}|a_2 - a_1| + N_{21}|a_1 - a_2| = m^2 |a_1 - a_2|$.  (3.21)

Maximizing over $A$ we find $\Omega_2(G,\prec) = m^3/n^3 = 1/8$ for every order $\prec$, cf.
Remark 3.6, and thus $\Omega_2(G) = 1/8$.

If we modify $G$ by adding a perfect matching inside $V_2$ (assuming $m$ is
even) then every order $<$ satisfying the condition of Corollary 3.5 is of the
type $\prec_1$. The added edges change each $e(v,A)$ by at most 1, and thus each
$\Omega_j(G,\prec,A)$ is changed by at most $1/n$. Hence this yields an example where
$\Omega_j(G,<) = (2 - O(1/n))\Omega_j(G)$ for $j = 0, 1$, for every order $<$ considered in
Corollary 3.5.
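
The values in Example 3.7 can be reproduced by brute force for $m = 2$ and $m = 3$. This is a minimal sketch under stated assumptions: vertices $0, \dots, m-1$ play the role of $V_1$ and $m, \dots, 2m-1$ the role of $V_2$, neighbourhoods and subsets are encoded as bitmasks, and the names `complete_bipartite`, `edge_counts`, `omega1_order` and `omega1` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def complete_bipartite(m):
    """Neighbour bitmasks of K_{m,m} with V_1 = {0..m-1} and V_2 = {m..2m-1}."""
    n = 2 * m
    left = (1 << m) - 1
    right = ((1 << n) - 1) ^ left
    return [right if v < m else left for v in range(n)]

def edge_counts(nbr):
    """counts[A][v] = e(v, A) for every subset A, encoded as a bitmask."""
    n = len(nbr)
    return [[bin(nbr[v] & A).count("1") for v in range(n)] for A in range(1 << n)]

def omega1_order(counts, order):
    """Omega_1(G, order) from (3.1) and (3.3)."""
    n = len(order)
    best = max(sum(max(c[order[i]] - c[order[k]], 0)
                   for i in range(n) for k in range(i + 1, n))
               for c in counts)
    return Fraction(best, n**3)

def omega1(nbr):
    """Omega_1(G) from (3.4): minimise over all n! orders (small n only)."""
    counts = edge_counts(nbr)
    return min(omega1_order(counts, p) for p in permutations(range(len(nbr))))

counts2 = edge_counts(complete_bipartite(2))
first = omega1_order(counts2, (0, 1, 2, 3))    # all of V_1 first, as for the order called prec_1
split = omega1_order(counts2, (0, 2, 3, 1))    # half of V_1, all of V_2, rest of V_1
print(first, split, omega1(complete_bipartite(2)), omega1(complete_bipartite(3)))
```

The case $m = 3$ already requires $720$ orders and $64$ subsets, so the search is not meant to scale beyond such toy sizes.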

4. Monotone kernels and graph limits

We begin by extending the definition of monotone kernels to other
probability spaces.

Definition. An ordered probability space $(\mathcal{S},\prec) = (\mathcal{S},\mathcal{F},\mu,\prec)$ is a
probability space $(\mathcal{S},\mathcal{F},\mu)$ with a (linear) order $\prec$ that is measurable, i.e.,
$\{(x,y) : x \prec y\}$ is a measurable subset of $\mathcal{S} \times \mathcal{S}$.

Note that it follows that $\{(x,y) : x \succ y\}$ and $\{(x,y) : x = y\}$ are measurable.

All orders considered in this paper are assumed to be measurable, even if
we only sometimes say so explicitly. Similarly, we only consider subsets and
functions that are measurable.

The standard example of an ordered probability space is $[0,1]$ with Lebesgue
measure and the standard order. $[0,1]$ is always equipped with these
unless we say otherwise.

Definition. Let $(\mathcal{S},\prec)$ be an ordered probability space. A monotone kernel
on $(\mathcal{S},\prec)$ is a kernel $W : \mathcal{S}^2 \to [0,1]$ such that

    $W(x_1,y_1) \le W(x_2,y_2)$ if $x_1 \preceq x_2$, $y_1 \preceq y_2$.  (4.1)

Let $\mathcal{W}_\uparrow(\mathcal{S},\prec)$ be the set of monotone kernels on $(\mathcal{S},\prec)$, noting that
$\mathcal{W}_\uparrow = \mathcal{W}_\uparrow([0,1])$. We shall prove the following properties of $\mathcal{W}_\uparrow(\mathcal{S},\prec)$
in Sections 5 and 6.

Theorem 4.1. Let $(\mathcal{S},\prec)$ be an ordered probability space.

(i) $\mathcal{W}_\uparrow(\mathcal{S},\prec)$ is a compact subset of $L^1(\mathcal{S}^2)$.

(ii) Two kernels in $\mathcal{W}_\uparrow(\mathcal{S},\prec)$ are equivalent if and only if they are a.e.
equal.

(iii) The metrics $\|W_1 - W_2\|_{L^1}$, $\delta_1(W_1,W_2)$, $\|W_1 - W_2\|_\square$, and $\delta_\square(W_1,W_2)$
are equivalent on $\mathcal{W}_\uparrow(\mathcal{S},\prec)$, i.e., induce the same topology.

Recall that $\mathcal{U}_\uparrow$ denotes the set of monotone graph limits, i.e., the class of
graph limits that can be represented as $\Gamma_W$ for some $W \in \mathcal{W}_\uparrow = \mathcal{W}_\uparrow([0,1])$.

Corollary 4.2. Each monotone graph limit has a representation as $\Gamma_W$ for
some $W \in \mathcal{W}_\uparrow = \mathcal{W}_\uparrow([0,1])$ with $W$ unique up to equality a.e. Furthermore,
there is a homeomorphism between $\mathcal{U}_\uparrow$ and $\mathcal{W}_\uparrow([0,1])$, regarded as a subset
of $L^1([0,1]^2)$.

Proof. Immediate from Theorem 4.1 and the fact that the metric on the set
of graph limits is equivalent to $\delta_\square$ on the corresponding kernels. □

In Section 1 we defined $\mathcal{U}_\uparrow$ as the set of graph limits that can be
represented by some $W \in \mathcal{W}_\uparrow([0,1])$. The following theorem shows that we
may allow monotone kernels on arbitrary ordered probability spaces without
changing $\mathcal{U}_\uparrow$, i.e.,

    $\mathcal{U}_\uparrow = \{\Gamma : \exists\, (\mathcal{S},\prec)$ and $W \in \mathcal{W}_\uparrow(\mathcal{S},\prec)$ such that $\Gamma = \Gamma_W\}$.

This version of the definition is perhaps more natural than considering $[0,1]$
only; on the other hand, it is often convenient to use $[0,1]$.

Theorem 4.3. Let $(\mathcal{S},\prec)$ be an ordered probability space, and let
$W \in \mathcal{W}_\uparrow(\mathcal{S},\prec)$. Then there is a monotone kernel $W' \in \mathcal{W}_\uparrow([0,1])$ that is
equivalent to $W$. Equivalently, $\Gamma_W \in \mathcal{U}_\uparrow$.

We shall next define two quantitative measures of how far a kernel is from
being monotone, in analogy with (1.3)-(1.6) (or, more closely, (3.1), (3.3)
and (3.4)), and (3.5)-(3.7).

Given $W \in L^1(\mathcal{S}^2)$, a (measurable) order $\prec$ on $\mathcal{S}$, and a (measurable)
subset $A$ of $\mathcal{S}$, set

    $\Omega_1(W,\prec,A) := \iint_{x \prec y} \Bigl( \int_A W(x,z) \, d\mu(z) - \int_A W(y,z) \, d\mu(z) \Bigr)_+ d\mu(x) \, d\mu(y)$,  (4.2)
    $\Omega_2(W,\prec,A) := \Omega_1(W,\prec,A) + \Omega_1(W,\prec,\mathcal{S} \setminus A)$,  (4.3)

and, for $j = 1, 2$,

    $\Omega_j(W,\prec) := \sup_{A \subseteq \mathcal{S}} \Omega_j(W,\prec,A)$,  (4.4)
    $\Omega_j(W) := \inf_{\prec} \Omega_j(W,\prec)$,  (4.5)

where the infimum is over all measurable orders on $\mathcal{S}$. Note that

    $\Omega_1(W) \le \Omega_2(W) \le 2\Omega_1(W)$.  (4.6)

For $A \subseteq \mathcal{S}$, let $W_A(x) := \int_A W(x,z) \, d\mu(z)$. Then (4.2) can be written as

    $\Omega_1(W,\prec,A) = \iint_{x \prec y} \bigl(W_A(x) - W_A(y)\bigr)_+ \, d\mu(x) \, d\mu(y)$.  (4.7)

Remark 4.4. It is easily seen that

    $\Omega_1(W,\prec) = \sup_{f,g} \iiint_{x \prec y} \bigl(W(x,z) - W(y,z)\bigr) f(x,y) g(z) \, d\mu(x) \, d\mu(y) \, d\mu(z)$,  (4.8)

where the supremum is taken over all $f : \mathcal{S}^2 \to \{0,1\}$ and $g : \mathcal{S} \to \{0,1\}$,
and that allowing all $f : \mathcal{S}^2 \to [0,1]$ and $g : \mathcal{S} \to [0,1]$ yields the same result.
Thus $\Omega_1(W,\prec)$ can be seen as a one-sided version of the cut norm of the
function $(W(x,z) - W(y,z)) \mathbf{1}\{x \prec y\}$ on $\mathcal{S}^2 \times \mathcal{S}$.
Similarly, $\Omega_2(W,\prec)$ equals

    $\sup_{f_1,f_2,g} \iiint_{x \prec y} \bigl(W(x,z) - W(y,z)\bigr) \bigl(f_1(x,y) g(z) + f_2(x,y)(1 - g(z))\bigr) \, d\mu(x) \, d\mu(y) \, d\mu(z)$,  (4.9)

where the supremum is taken either over all $f_1, f_2 : \mathcal{S}^2 \to \{0,1\}$ and
$g : \mathcal{S} \to \{0,1\}$, or over all $f_1, f_2 : \mathcal{S}^2 \to [0,1]$ and $g : \mathcal{S} \to [0,1]$.






BELA BOLLOBAS, SVANTE JANSON, AND OLIVER RIORDAN 



In the light of (4.6), $\Omega_1$ and $\Omega_2$ are essentially equivalent. In particular
$\Omega_1(W) = 0 \iff \Omega_2(W) = 0$. When the difference is not important, we
simply write $\Omega$; formally, this may be read as $\Omega_1$. Occasionally, there are
advantages to considering one or the other variant.

Theorem 4.5. Let $(\mathcal{S},\prec)$ be an ordered probability space and let $W$ be a
kernel on $\mathcal{S}$. Then $\Omega(W,\prec) = 0$ if and only if $W$ is a.e. equal to a monotone
kernel.

As noted above, $\Omega_j$, $j = 1, 2$, is an analogue of $\Omega_j$ defined earlier for
graphs. Indeed, there is a simple relation.

Lemma 4.6. If $G$ is a graph with an order $\prec$ on the vertex set $V$, and
$<$ denotes the standard order on $[0,1]$, then $\Omega_j(W_G,<) = \Omega_j(G,\prec)$ for
$j = 1, 2$.
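
As a plausibility check for Lemma 4.6 (not the paper's proof), one can evaluate (4.7) for sets $A$ that are unions of the intervals on which $W_G$ is constant. The notation $I_v$ and $A'$ below is introduced only for this sketch, and the computation does not address general measurable $A$, so it only shows that each value $\Omega_1(G,\prec,A')$ is attained by some $\Omega_1(W_G,<,A)$.

```latex
% Assumed notation: vertices are relabelled so that 1 \prec 2 \prec \dots \prec n,
% I_v = ((v-1)/n, v/n], A' \subseteq V, and A = \bigcup_{i \in A'} I_i.
\begin{align*}
W_A(x) &= \int_A W_G(x,z)\,dz = \frac1n \sum_{i \in A'} A_G(v,i) = \frac{e(v,A')}{n}
  \qquad (x \in I_v), \\
\Omega_1(W_G,<,A)
  &= \iint_{x<y} \bigl(W_A(x) - W_A(y)\bigr)_+ \,dx\,dy \\
  &= \sum_{v \prec w} \frac{1}{n^2} \cdot \frac{\bigl(e(v,A') - e(w,A')\bigr)_+}{n}
   = \Omega_1(G,\prec,A').
\end{align*}
% Pairs x < y inside the same interval I_v contribute 0, since W_A is constant there.
```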

For we shall show that Lemma [4.61 implies a corresponding result after 
minimizing over the relevant orderings. 

Lemma 4.7. If G is a graph, then ^(Wq) = 0.2(G). 

Note that Wq depends on the labelling of the vertices in G, but this is 
harmless since the different versions differ by measure-preserving bijections 
of [0, 1] (in fact, permutations of subintervals) and obviously have the same 

Remark 4.8. Let $G = K_{m,m}$ as in Example 3.7. Then $W_G$ does not depend on $m$, and one can check that $\Omega_1(W_G) = 1/16$. For $m$ odd, we have $\Omega_1(G) > 1/16$ by (3.20). Thus we can have $\Omega_1(W_G) < \Omega_1(G)$. It seems likely that the difference is bounded by some function tending to $0$ as $n \to \infty$, but we have not proved anything stronger than $\Omega_1(W_G) \le \Omega_1(G) \le 2\Omega_1(W_G)$, which follows from Lemma 4.7 and the relationship between $\Omega_1$ and $\Omega_2$.

Remark 4.9. Given a graph $G$, define $W_G^V$ as the adjacency matrix of $G$, regarded as a kernel on $V = V(G)$, which we regard as a probability space with the uniform probability measure (each point has mass $1/|G|$). It is easily verified that $\Omega_1(W_G^V, \prec, A) = \Omega_1(G, \prec, A)$ for every order $\prec$ on $V$ and every set $A \subseteq V$. Hence $\Omega_1(W_G^V, \prec) = \Omega_1(G, \prec)$ for every order $\prec$ and $\Omega_1(W_G^V) = \Omega_1(G)$, and the same holds for $\Omega_2$.

Note that $W_G^V$ and $W_G$ are equivalent kernels. It follows from Lemma 4.7 that $\Omega_2(W_G^V) = \Omega_2(W_G)$, but Remark 4.8 shows that $\Omega_1(W_G^V) > \Omega_1(W_G)$ if $G = K_{m,m}$ with $m$ odd. (See also Corollary 6.7 and Remark 6.8 below.)
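
In the finite setting of Remark 4.9 these quantities can be computed by brute force. The following Python sketch is an illustration only and not part of the argument. The function names are hypothetical; the kernel is assumed to be given as an integer (or `Fraction`) matrix on $\{0,\dots,n-1\}$ with the uniform measure; $\Omega_1(W,\prec,A)$ is evaluated via (4.7); and $\Omega_2(W,\prec,A)$ is assumed to be $\Omega_1(W,\prec,A) + \Omega_1(W,\prec,S\setminus A)$, which is the reading suggested by (4.9) with $g = \mathbf{1}_A$. The running time is exponential in $n$.

```python
from fractions import Fraction
from itertools import combinations, permutations


def omega1_given(W, order, A):
    """Omega_1(W, order, A) via (4.7) on {0..n-1} with uniform measure."""
    n = len(W)
    pos = {v: k for k, v in enumerate(order)}  # position of each point in the order
    # W_A(x) = integral over A of W(x, z) dmu(z), where mu({z}) = 1/n
    WA = [Fraction(sum(W[x][z] for z in A), n) for x in range(n)]
    total = Fraction(0)
    for x in range(n):
        for y in range(n):
            if pos[x] < pos[y]:  # x precedes y
                total += max(WA[x] - WA[y], 0)
    return total / (n * n)  # each pair (x, y) has mass 1/n^2


def omega(W, order, j=1):
    """sup over all subsets A of Omega_j(W, order, A), as in (4.4)."""
    n = len(W)
    best = Fraction(0)
    for r in range(n + 1):
        for A in combinations(range(n), r):
            val = omega1_given(W, order, A)
            if j == 2:  # assumed definition: add the term for the complement of A
                comp = [z for z in range(n) if z not in A]
                val += omega1_given(W, order, comp)
            best = max(best, val)
    return best


def omega_min(W, j=1):
    """inf over all orders of Omega_j(W, order), as in (4.5)."""
    return min(omega(W, order, j) for order in permutations(range(len(W))))


def complete_bipartite(m):
    """Adjacency matrix of K_{m,m} on vertices 0..2m-1."""
    n = 2 * m
    return [[1 if (x < m) != (y < m) else 0 for y in range(n)] for x in range(n)]
```

For $K_{1,1}$ and $K_{2,2}$ the function `omega_min` returns $1/8$ and $1/16$ for $j = 1$, and $1/8$ in both cases for $j = 2$, while a monotone 0/1 kernel with its natural order gives $0$.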

Remark 4.10. In (4.5), we take the infimum over all measurable orders on $S$. In general, this may be problematic, since there are probability spaces with no measurable orders, see Example 4.12 below. In such cases, we interpret (4.5) as $\Omega_j(W) = \infty$ (or perhaps 1), but this has the unhappy consequence that two equivalent kernels $W_1$ and $W_2$ may have $\Omega_2(W_1) \ne \Omega_2(W_2)$. For example, let $W_1$ and $W_2$ both be constant $1/2$, with $W_1$ defined on $[0,1]$ and $W_2$ on a space $S$ with no measurable order; then $\Omega_2(W_1) = 0$ and $\Omega_2(W_2) = \infty$. In the sequel we therefore consider only $S$ that have at least one measurable order. Even in this case, equivalent kernels may have different $\Omega_1$, see Remark 4.9. We will show in Corollary 6.7 that there is no such problem for $\Omega_2$. The case $\Omega(W) = 0$ is covered by the following theorem.

Theorem 4.11. Let W be a kernel on a probability space S with at least 
one measurable order. Then the following are equivalent. 

(i) $\Omega(W) = 0$.

(ii) There exists a measurable order $\prec$ on $S$ such that $W$ is a.e. equal to a monotone kernel on $(S, \prec)$.

(iii) $W$ is equivalent to a monotone kernel on some ordered probability space.

(iv) $W$ is equivalent to a monotone kernel on $[0,1]$.

(v) $\Gamma_W$ is a monotone graph limit.

Example 4.12. Let $S = [0,1]$, but equipped with the $\sigma$-field $\mathcal{F}$ consisting of the subsets of $S$ that are either countable or have a countable complement. For the measure $\mu$ we take the restriction of the Lebesgue measure to $\mathcal{F}$. (Thus, $\mu(A) = 0$ if $A$ is countable, and $\mu(A) = 1$ otherwise.)

Let $\mathcal{C}$ be the family of countable subsets of $S$. The $\sigma$-field $\mathcal{F} \times \mathcal{F}$ is contained in the $\sigma$-field

\[ \{A \subseteq S^2 : \exists B_1, B_2 \in \mathcal{C} \text{ such that } A \text{ or } S^2 \setminus A \subseteq (B_1 \times S) \cup (S \times B_2)\}. \]

Thus, if $\prec$ is a measurable order, then there exist $B_1, B_2 \in \mathcal{C}$ such that either

\[ \{(x,y) : x \prec y\} \subseteq (B_1 \times S) \cup (S \times B_2) \]

or

\[ \{(x,y) : x \not\prec y\} \subseteq (B_1 \times S) \cup (S \times B_2); \]

in the latter case we have

\[ \{(x,y) : x \prec y\} \subseteq \{(x,y) : y \not\prec x\} \subseteq (B_2 \times S) \cup (S \times B_1). \]

However, in both cases we find that if we choose two distinct $x, y \notin (B_1 \cup B_2)$, then neither $x \prec y$ nor $y \prec x$ holds, which is a contradiction. Thus $(S, \mathcal{F}, \mu)$ is a probability space supporting no measurable orders.

5. Proofs of Theorems 4.1–4.3

A downset in an ordered set $(S, \prec)$ is a subset $A$ such that if $x \prec y$ and $y \in A$, then $x \in A$. We begin with two lemmas concerning simple (and certainly well-known) properties of downsets; for completeness we give full proofs.

Lemma 5.1. (i) If $A$ and $B$ are downsets in a linearly ordered set $(S, \prec)$, then $A \subseteq B$ or $B \subseteq A$.

(ii) If $A$ and $B$ are downsets in an ordered probability space $(S, \prec)$ with $\mu(A) < \mu(B)$, then $A \subset B$.

Proof. (i): Otherwise there would exist $x \in A \setminus B$ and $y \in B \setminus A$, but then none of $x \prec y$, $y \prec x$ and $x = y$ is possible.

(ii): Now $B \subseteq A$ is impossible, and the result follows by (i). □

Lemma 5.2. If $(S, \prec)$ is an ordered probability space without atoms, then for every $t \in [0,1]$ there exists a downset $D(t)$ with $\mu(D(t)) = t$. Furthermore, $D(t) \subset D(u)$ when $t < u$.

Proof. It suffices to prove the first statement; the second then follows by Lemma 5.1(ii).

For $x \in S$, let $D_x$ be the downset $\{y \in S : y \prec x\}$. Let $X = X_0, X_1, X_2, \dots$ be an i.i.d. sequence of random points in $S$ (with the distribution $\mu$). Since there are no atoms, $\mathbb{P}(X_i = X_j) = 0$ for all $i \ne j$. Thus, for every $n$, $X_0, \dots, X_n$ are a.s. distinct, and by symmetry, all $(n+1)!$ orderings of them have the same probability $1/(n+1)!$. Hence,

\[ \mathbb{E}\bigl(\mu(D_X)^n\bigr) = \mathbb{P}(X_1, \dots, X_n \prec X_0) = \frac{n!}{(n+1)!} = \frac{1}{n+1}, \qquad n \ge 1. \]

Consequently, $\mu(D_X)$ has the same moments as the uniform distribution $U(0,1)$, and thus $\mu(D_X) \sim U(0,1)$.

It follows that the set $\{\mu(D_x) : x \in S\}$ is a dense subset of $[0,1]$. Hence, for every $t \in (0,1]$, there exists a sequence $(x_i)_i$ in $S$ such that $\mu(D_{x_i}) \nearrow t$ as $i \to \infty$. Then $D_{x_i} \subset D_{x_{i+1}}$ for $i \ge 1$ by Lemma 5.1(ii), and we can take $D(t) := \bigcup_{i=1}^{\infty} D_{x_i}$, which is a downset with $\mu(D(t)) = \lim_{i \to \infty} \mu(D_{x_i}) = t$. For $t = 0$ we take $D(0) := \emptyset$. □

Given an integrable function $W$ on $S^2$ and $A, B \subseteq S$ with $\mu(A), \mu(B) > 0$, let

\[ W(A, B) := \frac{1}{\mu(A)\mu(B)} \int_A \int_B W(x,y) \,d\mu(x)\,d\mu(y) \tag{5.1} \]

denote the average of $W$ over $A \times B$. If $\mathcal{P} = \{A_i\}$ is a finite partition of $S$, we say that a function on $S^2$ is a $\mathcal{P}$-step function if it is constant on each set $A_i \times A_j$. (A step function on $S^2$ is a $\mathcal{P}$-step function for some finite partition $\mathcal{P}$.) If $W \in L^1(S^2)$, we let $W_{\mathcal{P}}$ be the $\mathcal{P}$-step function defined by

\[ W_{\mathcal{P}}(x,y) = W(A_i, A_j) \quad \text{for } x \in A_i,\ y \in A_j. \tag{5.2} \]

If some $A_i$ has measure 0, then $W_{\mathcal{P}}$ is not defined everywhere, but it is always defined a.e., which suffices for us. Note that $W_{\mathcal{P}}$ is the conditional expectation of $W$ given the $\sigma$-field $\mathcal{F}_{\mathcal{P}} \times \mathcal{F}_{\mathcal{P}}$, where $\mathcal{F}_{\mathcal{P}}$ is the finite $\sigma$-field on $S$ generated by $\mathcal{P}$. It follows that $\|W_{\mathcal{P}}\|_\square \le \|W\|_\square$ and $\|W_{\mathcal{P}}\|_{L^1} \le \|W\|_{L^1}$. If $W$ is a kernel, then $W_{\mathcal{P}}$ is also a kernel. A kernel that is also a step function, such as $W_{\mathcal{P}}$, is called a step kernel.
Suppose now that $(S, \mu, \prec)$ is an atomless ordered probability space, and let $D(t)$, $0 \le t \le 1$, be an increasing family of downsets in $S$ with $\mu(D(t)) = t$ as in Lemma 5.2, with $D(0) = \emptyset$ and $D(1) = S$.

For $n \ge 1$ and $i = 1, \dots, n$, define

\[ A_i = A_{ni} := D(i/n) \setminus D((i-1)/n). \tag{5.3} \]

Then $\mathcal{P}_n := \{A_{ni}\}_i$ is a partition of $S$ into $n$ sets of the same measure $1/n$. Furthermore, if $i < j$, then $A_{ni} \prec A_{nj}$, meaning that if $x \in A_{ni}$ and $y \in A_{nj}$, then $x \prec y$.

Given a kernel $W$ on $S$, let $w^{(n)}_{ij} := W(A_{ni}, A_{nj})$ and let $W_n$ be the step kernel $W_{\mathcal{P}_n}$; thus $W_n = w^{(n)}_{ij}$ on $A_{ni} \times A_{nj}$. Define the step kernels $W_n^+$ and $W_n^-$ by $W_n^+(x,y) := w^{(n)}_{i+1,j+1}$ and $W_n^-(x,y) := w^{(n)}_{i-1,j-1}$ on $A_{ni} \times A_{nj}$, where $w^{(n)}_{ij} = 0$ if $i$ or $j = 0$ and $w^{(n)}_{ij} = 1$ if $i$ or $j = n+1$.

If $W$ is monotone, then the matrix $(w^{(n)}_{ij})_{ij}$ is increasing along each row and column, and thus $W_n$ is a monotone step kernel.

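Lemma 5.3. Let $W$ be a monotone kernel on an atomless ordered probability space $(S, \prec)$. Then $W_n^- \le W \le W_n^+$, $W_n^- \le W_n \le W_n^+$ and

\[ \|W_n - W\|_{L^1(S^2)} \le \|W_n^+ - W_n^-\|_{L^1(S^2)} \le 4/n. \]

Proof. If $(x,y) \in A_{ni} \times A_{nj}$ and $(x',y') \in A_{n,i+1} \times A_{n,j+1}$ (with $i, j \le n-1$), then $W(x,y) \le W(x',y')$, and averaging over $(x',y')$ it follows that $W(x,y) \le w^{(n)}_{i+1,j+1} = W_n^+(x,y)$. This inequality evidently holds also if $i$ or $j = n$. Hence $W \le W_n^+$. Similarly, $W \ge W_n^-$.

Averaging over $A_{ni} \times A_{nj}$, it follows that $W_n^- \le W_n \le W_n^+$. (This also follows directly from the monotonicity of $w^{(n)}_{ij}$.) Consequently, $|W_n - W| \le W_n^+ - W_n^-$, and thus

\[ \|W_n - W\|_{L^1(S^2)} \le \iint_{S^2} (W_n^+ - W_n^-) = n^{-2} \sum_{i,j=1}^{n} \bigl(w^{(n)}_{i+1,j+1} - w^{(n)}_{i-1,j-1}\bigr) \]
\[ = n^{-2} \sum_{i,j=2}^{n+1} w^{(n)}_{ij} - n^{-2} \sum_{i,j=0}^{n-1} w^{(n)}_{ij} \le 2 n^{-2} \sum_{i=n}^{n+1} \sum_{j=2}^{n+1} w^{(n)}_{ij} \le 4/n. \qquad □ \]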
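
As a numerical sanity check of the bound in Lemma 5.3, one can take $S = [0,1]$ with the intervals $A_{ni} = ((i-1)/n, i/n]$ and approximate $\|W_n - W\|_{L^1}$ on a grid. The following Python sketch makes several assumptions that are not from the paper: the function name is hypothetical, the kernel is passed as a Python function, and block averages and the $L^1$ distance are approximated by midpoint sampling with `m` points per interval, so the result is only an approximation.

```python
def step_average_error(W, n, m=20):
    """Approximate ||W_n - W||_{L^1([0,1]^2)}, where W_n averages W over
    the n x n blocks of the partition of [0,1] into n equal intervals."""
    N = n * m
    pts = [(k + 0.5) / N for k in range(N)]  # midpoints of a fine grid
    # Block averages w_ij, approximated from the grid samples.
    w = [[0.0] * n for _ in range(n)]
    for a, x in enumerate(pts):
        for b, y in enumerate(pts):
            w[a // m][b // m] += W(x, y)
    for i in range(n):
        for j in range(n):
            w[i][j] /= m * m
    # Approximate L^1 distance between W and its step version.
    err = 0.0
    for a, x in enumerate(pts):
        for b, y in enumerate(pts):
            err += abs(W(x, y) - w[a // m][b // m])
    return err / (N * N)
```

For monotone kernels such as $W(x,y) = xy$ or $W(x,y) = \mathbf{1}\{x + y \ge 1\}$, the returned value should stay below $4/n$.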

Trivially, for any kernel $W$ we have $\|W\|_\square \le \|W\|_{L^1(S^2)}$. In general there is no reverse inequality. However, if $\mathcal{P}$ is a partition of $S$ into $n$ sets and $W$ is a $\mathcal{P}$-step function, then it is trivial to bound $\|W\|_{L^1(S^2)}$ from above by a polynomial in $n$ times $\|W\|_\square$. Indeed, one can write $\|W\|_{L^1(S^2)}$ as a sum of $n$ integrals of the form in (2.1), in each taking $g$ to be 1 on one part of $\mathcal{P}$ and zero elsewhere, and choosing the sign of $f$ on each part appropriately. In fact, the correct polynomial order is $\sqrt{n}$, as shown in [14].
Lemma 5.4. Let $S$ be a probability space and $\mathcal{P}$ a partition of $S$ into $n$ sets. If $W$ is a $\mathcal{P}$-step function, then $\|W\|_{L^1(S^2)} \le \sqrt{2n}\,\|W\|_\square$. Furthermore, for any $W \in L^1(S^2)$ we have

\[ \|W_{\mathcal{P}}\|_{L^1(S^2)} \le \sqrt{2n}\,\|W\|_\square. \tag{5.4} \]

Proof. It suffices to prove the first statement; the second follows immediately, since $W_{\mathcal{P}}$ is a $\mathcal{P}$-step function, and $\|W_{\mathcal{P}}\|_\square \le \|W\|_\square$.

The statement and proof are (essentially) present in Remark 9.8 of [14]. Nevertheless, let us write out the proof.

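In 1930, Littlewood [18] proved that there is a constant $c \le \sqrt{3}$ such that for any $n$-by-$n$ array of real numbers $a_{ij}$ we have

\[ \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} a_{ij}^2\Bigr)^{1/2} \le c \max_{\varepsilon_i, \varepsilon'_j = \pm 1} \sum_{i=1}^{n} \sum_{j=1}^{n} \varepsilon_i \varepsilon'_j a_{ij} = c \max_{\varepsilon_i = \pm 1} \sum_{j=1}^{n} \Bigl|\sum_{i=1}^{n} \varepsilon_i a_{ij}\Bigr| = c \max_{\varepsilon'_j = \pm 1} \sum_{i=1}^{n} \Bigl|\sum_{j=1}^{n} \varepsilon'_j a_{ij}\Bigr|. \]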

Later it was noticed (see [27], Ch. 5 and [1]) that this inequality of Littlewood's could be deduced from a special case of an inequality that had been proved some years earlier by Khintchine [Hi ]. In 1976, Szarek [24] proved that the best constant in Littlewood's inequality (in fact, in the corresponding inequality of Khintchine) is $\sqrt{2}$. For some related results, see, e.g., j§],

El O, M and 0- 

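As noted in [14], using the Cauchy–Schwarz inequality and Littlewood's inequality, with the constant $c = \sqrt{2}$ proved by Szarek, it follows that

\[ \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}| \le \sum_{i=1}^{n} n^{1/2} \Bigl(\sum_{j=1}^{n} a_{ij}^2\Bigr)^{1/2} \le \sqrt{2n} \max_{\varepsilon_i, \varepsilon'_j = \pm 1} \sum_{i=1}^{n} \sum_{j=1}^{n} \varepsilon_i \varepsilon'_j a_{ij}. \tag{5.5} \]

Returning to the proof of Lemma 5.4, let the parts of $\mathcal{P}$ be $A_1, \dots, A_n$, and set $a_{ij} = \mu(A_i)\mu(A_j) w_{ij}$, where $w_{ij}$ is the value of $W$ on $A_i \times A_j$. Then

\[ \|W\|_{L^1(S^2)} = \sum_{i,j} |a_{ij}|. \]

In the definition (2.1) of the cut norm, restricting our attention to functions $f, g : S \to \{\pm 1\}$ that are constant on each $A_i$, we find that

\[ \|W\|_\square \ge \max_{\varepsilon_i, \varepsilon'_j = \pm 1} \sum_{i=1}^{n} \sum_{j=1}^{n} \varepsilon_i \varepsilon'_j a_{ij} \]

(in fact, equality holds), so the result follows from (5.5). □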
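
The inequality (5.5) is easy to test on small arrays. The Python sketch below is only an illustration with hypothetical function names: it computes both sides of (5.5) by brute force over all sign vectors $\varepsilon$, using the fact (from Littlewood's inequality as displayed above) that for fixed $\varepsilon$ the best choice of $\varepsilon'$ gives $\sum_j |\sum_i \varepsilon_i a_{ij}|$.

```python
import math
from itertools import product


def abs_sum(a):
    """Left-hand side of (5.5): the sum of |a_ij| over all entries."""
    return sum(abs(v) for row in a for v in row)


def max_sign_sum(a):
    """max over eps, eps' in {+1,-1}^n of sum_ij eps_i eps'_j a_ij.

    For fixed eps the optimal eps'_j is the sign of the j-th column sum,
    so the inner maximum equals sum_j |sum_i eps_i a_ij|.
    """
    n = len(a)
    best = 0.0
    for eps in product((1, -1), repeat=n):
        col = [sum(eps[i] * a[i][j] for i in range(n)) for j in range(n)]
        best = max(best, sum(abs(c) for c in col))
    return best


def sides_of_5_5(a):
    """Return (lhs, rhs) of (5.5) for a square array a."""
    n = len(a)
    return abs_sum(a), math.sqrt(2 * n) * max_sign_sum(a)
```

For the $2 \times 2$ array with rows $(1, 1)$ and $(1, -1)$ both sides equal 4, so in this instance (5.5) holds with equality.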



As noted in [14], it is easy to check that the factor $\sqrt{2n}$ is best possible apart from the constant, for example by considering 0/1-valued kernels associated to random graphs. For arbitrary monotone kernels, the lemmas above allow us to bound the $L^1$-norm in terms of the cut norm.
Theorem 5.5. If $W_1$ and $W_2$ are monotone kernels on an ordered probability space $(S, \prec)$, then

\[ \|W_1 - W_2\|_{L^1(S^2)} \le 10\,\|W_1 - W_2\|_\square^{2/3}. \tag{5.6} \]

Proof. Suppose first that $S$ is atomless. Let $n \ge 1$ and consider the partition $\mathcal{P}_n = \{A_{ni}\}_i$ defined in (5.3) and the step kernels $W_{k,n} = (W_k)_{\mathcal{P}_n}$, $k = 1,2$. Lemma 5.4 yields

\[ \|W_{1,n} - W_{2,n}\|_{L^1(S^2)} = \|(W_1 - W_2)_{\mathcal{P}_n}\|_{L^1(S^2)} \le \sqrt{2n}\,\|W_1 - W_2\|_\square. \tag{5.7} \]

By Lemma 5.3 we have $\|W_k - W_{k,n}\|_{L^1(S^2)} \le 4/n$, so by the triangle inequality

\[ \|W_1 - W_2\|_{L^1(S^2)} \le \|W_{1,n} - W_{2,n}\|_{L^1(S^2)} + 8/n \le \sqrt{2n}\,\|W_1 - W_2\|_\square + 8/n. \]

The result for atomless $S$ now follows by choosing $n := \lceil \|W_1 - W_2\|_\square^{-2/3} \rceil \le 2\|W_1 - W_2\|_\square^{-2/3}$. (In the case $\|W_1 - W_2\|_\square = 0$, we let $n \to \infty$.)

If $S$ has atoms, we consider the atomless probability space $\hat{S} := S \times [0,1]$ with the lexicographic order. Let $\pi : \hat{S} \to S$ be the projection onto the first coordinate and let $\hat{W}_k := W_k^\pi$ be the extension of $W_k$ to $\hat{S}$. The proof just given applies to $\hat{S}$, and thus

\[ \|W_1 - W_2\|_{L^1(S^2)} = \|\hat{W}_1 - \hat{W}_2\|_{L^1(\hat{S}^2)} \le 10\,\|\hat{W}_1 - \hat{W}_2\|_\square^{2/3} = 10\,\|W_1 - W_2\|_\square^{2/3}. \]

□
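
The arithmetic behind the choice of $n$ in the atomless case can be checked directly. Writing $\delta := \|W_1 - W_2\|_\square \in (0,1]$, the bound $n \le 2\delta^{-2/3}$ gives $\sqrt{2n}\,\delta \le 2\delta^{2/3}$, and $n \ge \delta^{-2/3}$ gives $8/n \le 8\delta^{2/3}$, for a total of at most $10\delta^{2/3}$. The short Python sketch below (hypothetical function name, floating-point arithmetic) evaluates the two terms for a given $\delta$.

```python
import math


def l1_bound_from_cut_norm(delta):
    """For a cut-norm distance delta in (0, 1], return (n, bound) where
    n = ceil(delta^(-2/3)) and bound = sqrt(2n) * delta + 8 / n."""
    n = math.ceil(delta ** (-2.0 / 3.0))
    return n, math.sqrt(2 * n) * delta + 8.0 / n
```

For example, $\delta = 1$ gives $n = 1$ and the bound $\sqrt{2} + 8 < 10$.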

Example 5.6. It is easy to see that (5.6) is tight apart from the constant. Indeed, let $S$ be the discrete probability space with $n$ equiprobable elements $\{0, 1, \dots, n-1\}$, and choose two 0/1-valued kernels on $S$ with $\|W_1 - W_2\|_{L^1(S^2)} = \Theta(1)$ and $\|W_1 - W_2\|_\square = \Theta(n^{-1/2})$. For example, we may take kernels corresponding to two independent instances of the random graph $G(n, 1/2)$. Let $W$ be the function defined by $W(i,j) = i + j$. Then it is easy to see that $W'_i = (W_i + W)/(2n)$ is a monotone kernel for each $i$. Since $\|W'_1 - W'_2\|_{L^1(S^2)} = \|W_1 - W_2\|_{L^1(S^2)}/(2n) = \Theta(n^{-1})$ and $\|W'_1 - W'_2\|_\square = \|W_1 - W_2\|_\square/(2n) = \Theta(n^{-3/2})$, this gives monotone kernels $W'_1$ and $W'_2$ with $\|W'_1 - W'_2\|_{L^1(S^2)} = \Theta(\|W'_1 - W'_2\|_\square^{2/3})$.
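
The construction in Example 5.6 can be illustrated with a few lines of Python. This is a sketch with hypothetical function names, not the authors' code: it builds the shifted kernel $(W_i + W)/(2n)$ with $W(i,j) = i + j$ from a symmetric 0/1 matrix and checks that the result is symmetric, takes values in $[0,1]$ and is increasing in the first variable (hence, by symmetry, in both).

```python
import random


def random_graph_kernel(n, rng):
    """Symmetric 0/1 matrix with zero diagonal, a sample of G(n, 1/2)."""
    W0 = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            W0[i][j] = W0[j][i] = rng.randint(0, 1)
    return W0


def shifted_kernel(W0):
    """The kernel (W0 + W) / (2n) of Example 5.6, with W(i, j) = i + j."""
    n = len(W0)
    return [[(W0[i][j] + i + j) / (2 * n) for j in range(n)] for i in range(n)]


def is_monotone_kernel(W):
    """Symmetric, [0,1]-valued and increasing in the first coordinate."""
    n = len(W)
    symmetric = all(W[i][j] == W[j][i] for i in range(n) for j in range(n))
    in_range = all(0 <= W[i][j] <= 1 for i in range(n) for j in range(n))
    increasing = all(W[i][j] <= W[i + 1][j]
                     for i in range(n - 1) for j in range(n))
    return symmetric and in_range and increasing


def l1_distance(U, V):
    """L^1 distance on {0..n-1}^2 with the uniform measure."""
    n = len(U)
    return sum(abs(U[i][j] - V[i][j]) for i in range(n) for j in range(n)) / n**2
```

Monotonicity holds because increasing $i$ by one raises $i + j$ by 1, which outweighs any drop of at most 1 in the 0/1 entry; the $L^1$ distance between two shifted kernels is the original distance divided by $2n$.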

Our next aim is to prove the rather unsurprising fact that if we start from 
two monotone kernels, then 'rearranging' one or both does not bring them 
any closer in the L 1 distance. First we need a preparatory lemma; this can 
be viewed as a continuous, coupling version of the trivial observation that 
if we wish to minimize $\sum_{i=1}^{n} |a_i - b_i|$ (or, equivalently, $\sum_{i=1}^{n} (a_i - b_i)_+$), where the values in each sequence are given but we are allowed to permute them, then we should sort both sequences into ascending order.
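
The discrete observation can be confirmed by exhaustive search on small inputs. The following Python sketch (hypothetical function names) compares the cost of the sorted matching with the minimum over all permutations of one sequence.

```python
from itertools import permutations


def best_matching_cost(a, b):
    """Minimum of sum |a_i - b_sigma(i)| over all permutations sigma of b."""
    return min(sum(abs(x - y) for x, y in zip(a, p)) for p in permutations(b))


def sorted_cost(a, b):
    """Cost of matching both sequences in ascending order."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b)))
```

For $a = (5, 1, 3)$ and $b = (2, 6, 0)$ both functions return 3, whereas the pairing in the given order costs 11.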

Lemma 5.7. If $h_1, h_2 : S \to \mathbb{R}$ are increasing integrable functions on an ordered probability space $(S, \mu, \prec)$, and $\varphi_1, \varphi_2 : S' \to S$ are measure-preserving maps from a probability space $(S', \mu')$ to $(S, \mu)$, then

\[ \int_{S'} \bigl(h_1^{\varphi_1} - h_2^{\varphi_2}\bigr)_+ \,d\mu' \ge \int_S (h_1 - h_2)_+ \,d\mu \tag{5.8} \]

and $\|h_1^{\varphi_1} - h_2^{\varphi_2}\|_{L^1(S')} \ge \|h_1 - h_2\|_{L^1(S)}$.

Proof. For any integrable function $h$ on any measure space we have $\|h\|_{L^1} = \int h_+ + \int (-h)_+$, so it suffices to prove the first statement.

For any function $f$ and real number $t$, let $B_f(t) := \{x : f(x) \le t\}$. Fubini's theorem yields

\[ \int_S (h_1 - h_2)_+ \,d\mu = \int_S \int_{-\infty}^{\infty} \mathbf{1}\{h_1(x) > t \ge h_2(x)\} \,dt\,d\mu(x) \]
\[ = \int_{-\infty}^{\infty} \int_S \mathbf{1}\{x \in B_{h_2}(t) \setminus B_{h_1}(t)\} \,d\mu(x)\,dt = \int_{-\infty}^{\infty} \mu\bigl(B_{h_2}(t) \setminus B_{h_1}(t)\bigr) \,dt. \]

Similarly,

\[ \int_{S'} \bigl(h_1^{\varphi_1} - h_2^{\varphi_2}\bigr)_+ \,d\mu' = \int_{-\infty}^{\infty} \mu'\bigl(B_{h_2^{\varphi_2}}(t) \setminus B_{h_1^{\varphi_1}}(t)\bigr) \,dt. \]

Since the $\varphi_i$ are measure preserving, we have $\mu'(B_{h_i^{\varphi_i}}(t)) = \mu'(\varphi_i^{-1}(B_{h_i}(t))) = \mu(B_{h_i}(t))$. Since $h_1$ and $h_2$ are increasing, $B_{h_1}(t)$ and $B_{h_2}(t)$ are downsets, so by Lemma 5.1 they are nested. The result follows by noting that $\mu(X \setminus Y) \ge (\mu(X) - \mu(Y))_+$, with equality if $X$ and $Y$ are nested. □

Lemma 5.8. If $W_1$ and $W_2$ are monotone kernels on an ordered probability space $(S, \prec)$, then $\delta_1(W_1, W_2) = \|W_1 - W_2\|_{L^1(S^2)}$.

Proof. Suppose that $\varphi_1, \varphi_2$ are measure-preserving maps $S' \to S$ for some probability space $(S', \mu')$. Then, using Lemma 5.7 on each coordinate separately,

\[ \|W_1^{\varphi_1} - W_2^{\varphi_2}\|_{L^1((S')^2)} = \int_{S'} \int_{S'} \bigl|W_1(\varphi_1(x), \varphi_1(y)) - W_2(\varphi_2(x), \varphi_2(y))\bigr| \,d\mu'(x)\,d\mu'(y) \]
\[ \ge \int_{S'} \int_S \bigl|W_1(t, \varphi_1(y)) - W_2(t, \varphi_2(y))\bigr| \,d\mu(t)\,d\mu'(y) \]
\[ \ge \int_S \int_S \bigl|W_1(t,u) - W_2(t,u)\bigr| \,d\mu(t)\,d\mu(u) = \|W_1 - W_2\|_{L^1(S^2)}, \]

where for the last step we first apply Fubini's Theorem to change the order of integration. The result follows by the definition (2.4). □

With a little more work, we obtain a corresponding result for the cut 
norm and cut metric. Unfortunately, we need to consider a variant of the 
definition. 




If $W$ is an integrable function on $S^2$, let

\[ \|W\|_{\square,1} := \sup_{f,g : S \to \{0,1\}} \Bigl| \int_{S^2} W(x,y) f(x) g(y) \,d\mu(x)\,d\mu(y) \Bigr|, \tag{5.9} \]

where the supremum is over all pairs of measurable 0/1-valued functions on $S$. (We could equally well consider functions taking values in $[0,1]$; the value of the supremum does not change.) Expressing each of the functions $f, g$ in (2.2) as the difference of two 0/1-valued functions, we see that

\[ \|W\|_{\square,1} \le \|W\|_\square \le 4\|W\|_{\square,1}, \tag{5.10} \]

so for all questions concerning convergence, the norms are equivalent.
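
On a finite probability space with the uniform measure both norms in (5.10) can be computed by exhaustive search. The Python sketch below is an illustration with hypothetical function names; it assumes that the cut norm is the supremum of $|\int W f g|$ over $f, g$ with values in $[-1,1]$ (by bilinearity the supremum is then attained at $\pm 1$-valued functions), which is the form used in the proof of Lemma 5.4.

```python
from itertools import product


def bilinear(W, f, g):
    """Integral of W(x,y) f(x) g(y) over {0..n-1}^2 with uniform measure."""
    n = len(W)
    return sum(W[x][y] * f[x] * g[y] for x in range(n) for y in range(n)) / n**2


def cut_norm(W):
    """Cut norm (assumed form): max over f, g with values in {-1, +1}."""
    n = len(W)
    return max(abs(bilinear(W, f, g))
               for f in product((-1, 1), repeat=n)
               for g in product((-1, 1), repeat=n))


def cut_norm_01(W):
    """The variant (5.9): max over 0/1-valued f and g."""
    n = len(W)
    return max(abs(bilinear(W, f, g))
               for f in product((0, 1), repeat=n)
               for g in product((0, 1), repeat=n))
```

For the $2 \times 2$ matrix with rows $(1, -1)$ and $(-1, 1)$ the two values are $1$ and $1/4$, so the factor 4 in (5.10) is attained in this instance.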
In analogy with (2.3), given $W_i \in L^1(S_i^2)$, $i = 1,2$, let

\[ \delta_{\square,1}(W_1, W_2) := \inf_{\varphi_1, \varphi_2} \|W_1^{\varphi_1} - W_2^{\varphi_2}\|_{\square,1}, \tag{5.11} \]

where, as in (2.3), the infimum is taken over all couplings $(\varphi_1, \varphi_2)$ of $S_1$ and $S_2$.

Lemma 5.9. If $W_1$ and $W_2$ are monotone kernels on an ordered probability space $(S, \prec)$, then $\delta_{\square,1}(W_1, W_2) = \|W_1 - W_2\|_{\square,1}$.

Proof. Suppose that $\varphi_1, \varphi_2$ are measure-preserving maps $S' \to S$ for some probability space $S'$. It suffices to show that $\|W_1^{\varphi_1} - W_2^{\varphi_2}\|_{\square,1} \ge \|W_1 - W_2\|_{\square,1}$.

Given a probability space $(S, \mu)$, an integrable function $W$ on $S^2$, and two functions $f, g : S \to \{0,1\}$, set

\[ I_{f,g}(W) := \int_{S^2} W(x,y) f(x) g(y) \,d\mu(x)\,d\mu(y), \]

so $\|W\|_{\square,1} = \sup_{f,g} |I_{f,g}(W)|$. Swapping $W_1$ and $W_2$ if necessary, we may assume that $\|W_1 - W_2\|_{\square,1} = \sup_{f,g} I_{f,g}(W_1 - W_2)$. Hence, fixing (arbitrary) functions $f, g : S \to \{0,1\}$, it suffices to prove that

\[ \sup_{f',g'} I_{f',g'}\bigl(W_1^{\varphi_1} - W_2^{\varphi_2}\bigr) \ge I_{f,g}(W_1 - W_2), \tag{5.12} \]

since $\|W_1^{\varphi_1} - W_2^{\varphi_2}\|_{\square,1}$ is at least the left-hand side.

The first statement (5.8) of Lemma 5.7 says exactly that if $h_1$ and $h_2$ are increasing, integrable functions on $(S, \mu, \prec)$ and $\varphi_1, \varphi_2 : (S', \mu') \to (S, \mu)$ are measure-preserving, then

\[ \max_{f' : S' \to \{0,1\}} \int_{S'} \bigl(h_1(\varphi_1(x)) - h_2(\varphi_2(x))\bigr) f'(x) \,d\mu'(x) \ge \max_{f : S \to \{0,1\}} \int_S \bigl(h_1(t) - h_2(t)\bigr) f(t) \,d\mu(t), \tag{5.13} \]

where the maximization is over all $\{0,1\}$-valued functions on the relevant space; the corresponding supremum is clearly attained. We shall use this inequality twice; in particular, we shall twice use the observation that a specific $f$ on the right is 'beaten' by some $f'$ on the left.
Let $h_i(t) = \int_S W_i(t,u) g(u) \,d\mu(u)$. Then (since $g(u)$ is non-negative), $h_i$ is monotone. Applying (the observation following) (5.13) to these functions and our function $f$, we find that there is some $f' : S' \to \{0,1\}$ such that

\[ \int_{S'} \Bigl( \int_S \bigl(W_1(\varphi_1(x), u) - W_2(\varphi_2(x), u)\bigr) g(u) \,d\mu(u) \Bigr) f'(x) \,d\mu'(x) \]
\[ \ge \int_S \Bigl( \int_S \bigl(W_1(t,u) - W_2(t,u)\bigr) g(u) \,d\mu(u) \Bigr) f(t) \,d\mu(t) = I_{f,g}(W_1 - W_2). \]

Using Fubini's Theorem, we may rewrite the left-hand side as

\[ I := \int_S \Bigl( \int_{S'} \bigl(W_1(\varphi_1(x), u) - W_2(\varphi_2(x), u)\bigr) f'(x) \,d\mu'(x) \Bigr) g(u) \,d\mu(u). \]

Let $h'_i(u) = \int_{S'} W_i(\varphi_i(x), u) f'(x) \,d\mu'(x)$. Then the $h'_i$ are again monotone, so applying (5.13) to these functions and $g$ gives a $g' : S' \to \{0,1\}$ such that

\[ \int_{S'} \Bigl( \int_{S'} \bigl(W_1(\varphi_1(x), \varphi_1(y)) - W_2(\varphi_2(x), \varphi_2(y))\bigr) f'(x) \,d\mu'(x) \Bigr) g'(y) \,d\mu'(y) \ge I. \]

But now the left-hand side is simply $I_{f',g'}(W_1^{\varphi_1} - W_2^{\varphi_2})$, so we have $I_{f',g'}(W_1^{\varphi_1} - W_2^{\varphi_2}) \ge I \ge I_{f,g}(W_1 - W_2)$, establishing (5.12). □

In the light of (5.10), Lemma 5.9 has the following immediate corollary.

Lemma 5.10. If $W_1$ and $W_2$ are monotone kernels on an ordered probability space $(S, \prec)$, then $\delta_\square(W_1, W_2) \ge \|W_1 - W_2\|_\square / 4$. □

It seems plausible that $\delta_\square(W_1, W_2) = \|W_1 - W_2\|_\square$ for monotone kernels, but we do not have a proof (or indeed a strong feeling that this is actually true).

We are now ready to bound the L 1 distance with 'rearrangement' in terms 
of the cut metric, when the kernels in question are monotone. 

Lemma 5.11. If $W_1$ and $W_2$ are monotone kernels on an ordered probability space $(S, \mu, \prec)$, then

\[ \delta_1(W_1, W_2) \le 26\,\delta_\square(W_1, W_2)^{2/3}. \tag{5.14} \]

Proof. Combining Lemma 5.8, Theorem 5.5 and Lemma 5.10 we have

\[ \delta_1(W_1, W_2) = \|W_1 - W_2\|_{L^1(S^2)} \le 10\,\|W_1 - W_2\|_\square^{2/3} \le 10\bigl(4\,\delta_\square(W_1, W_2)\bigr)^{2/3}, \tag{5.15} \]

giving the result. □

Remark 5.12. Using Theorem 4.3 (which is proved below), Lemma 5.11 immediately extends to monotone kernels defined on possibly different ordered probability spaces.

Remark 5.13. The exponent $2/3$ in (5.14) is best possible, as shown by the kernels $W'_1, W'_2$ in Example 5.6. Indeed, for these kernels, the first inequality in (5.15) is tight up to the constant. The second inequality is always tight up to the constant $4^{2/3}$ since, by definition, $\delta_\square(W_1, W_2) \le \|W_1 - W_2\|_\square$.
We are now ready to prove the first few results in Section 4.

Proof of Theorem 4.1. The equivalence of the different metrics in (iii) follows from Theorem 5.5, Lemmas 5.8 and 5.10 (see (5.15)) and the inequality $\delta_\square(W_1, W_2) \le \delta_1(W_1, W_2)$.

As a special case, for two kernels $W_1, W_2 \in \mathcal{W}_\uparrow(S)$,

\[ \delta_\square(W_1, W_2) = 0 \iff \|W_1 - W_2\|_{L^1(S^2)} = 0 \iff W_1 = W_2 \text{ a.e.}, \]

which establishes (ii).

For (i), we show that $\mathcal{W}_\uparrow(S)$ is closed and totally bounded as a subset of $L^1(S^2)$. First, if $W_\nu \in \mathcal{W}_\uparrow(S)$ and $W_\nu \to W$ in $L^1(S^2)$ as $\nu \to \infty$, then there is a subsequence that converges a.e. to $W$, and replacing $W$ by the limsup of that subsequence, we see that $W \in \mathcal{W}_\uparrow(S)$. Hence, $\mathcal{W}_\uparrow(S)$ is closed.

Next, first assume that $S$ is atomless. By Lemma 5.3, for every $n$ there is a partition $\mathcal{P}_n$ such that for every kernel $W \in \mathcal{W}_\uparrow(S)$, there is a $\mathcal{P}_n$-step kernel $W_n$ with $\|W - W_n\|_{L^1(S^2)} \le 4/n$. If $F_n$ is the finite set of $\mathcal{P}_n$-step kernels taking values in $\{0, \frac{1}{n}, \frac{2}{n}, \dots, 1\}$, then there always exists a $W'_n \in F_n$ with $\|W_n - W'_n\|_{L^1(S^2)} \le 1/n$, and thus $\|W - W'_n\|_{L^1(S^2)} \le 5/n$. Since $n$ is arbitrary, this shows that $\mathcal{W}_\uparrow(S)$ is totally bounded.

If $S$ has atoms, we consider as above $\hat{S} = S \times [0,1]$ and $\pi : \hat{S} \to S$; then $W \mapsto W^\pi$ is an isometric embedding of $L^1(S^2)$ into $L^1(\hat{S}^2)$. This embeds $\mathcal{W}_\uparrow(S)$ into $\mathcal{W}_\uparrow(\hat{S})$, and since the latter is totally bounded, $\mathcal{W}_\uparrow(S)$ is too. □

Proof of Theorem 4.3. If $S$ has atoms, we replace it, as above, by $\hat{S} = S \times [0,1]$; thus we may assume that $S$ is atomless. By Lemma 5.3, there is a sequence of step kernels $W_n$ that converges to $W$ in $L^1(S^2)$. Each $W_n$ is obviously equivalent to the monotone step kernel $W'_n$ on $[0,1]$ defined by $W'_n = w^{(n)}_{ij}$ on $I_i \times I_j$, where $I_i := ((i-1)/n, i/n]$. We have $\|W'_n - W'_m\|_{L^1([0,1]^2)} = \|W_n - W_m\|_{L^1(S^2)}$, and thus $(W'_n)$ is a Cauchy sequence in $L^1([0,1]^2)$. Hence there is some $W'$ such that $W'_n \to W'$ in $L^1([0,1]^2)$, and Theorem 4.1(i) implies that $W' \in \mathcal{W}_\uparrow([0,1])$. For every $n$,

\[ \delta_\square(W, W') \le \delta_\square(W, W_n) + \delta_\square(W_n, W'_n) + \delta_\square(W'_n, W') \le \frac{4}{n} + 0 + \|W'_n - W'\|_{L^1([0,1]^2)}. \]

Since $W'_n \to W'$ in $L^1([0,1]^2)$, it follows that $\delta_\square(W, W') = 0$, so $W$ and $W'$ are equivalent. □

6. Proofs of Theorems 4.5–4.11

In this section we prove the remaining results in Section 4, namely, Theorem 4.5, Lemmas 4.6 and 4.7, and Theorem 4.11.

We start with a technical lemma, which is fairly obvious but nevertheless 
deserves to be stated precisely. 
Lemma 6.1. Suppose that $(S_1, \mu_1, \prec_1)$ and $(S_2, \mu_2, \prec_2)$ are ordered probability spaces, and that $S_1 \times S_2$ is equipped with a probability measure $\mu$ such that the projection $\pi_1$ onto $S_1$ is measure-preserving. Let $\prec_1^*$ be the lexicographic order on $S_1 \times S_2$. If $W$ is a kernel on $S_1$, then for $j = 1,2$,

\[ \Omega_j(W, \prec_1) = \Omega_j(W^{\pi_1}, \prec_1^*). \]

In most applications, we take $\mu = \mu_1 \times \mu_2$.

Proof. Writing $x \in S := S_1 \times S_2$ as $x = (x_1, x_2)$, by (4.8), $\Omega_1(W^{\pi_1}, \prec_1^*)$ is equal to

\[ \sup_{f,g} \iiint_{x \prec_1^* y} \bigl(W(x_1, z_1) - W(y_1, z_1)\bigr) f(x,y)\, g(z) \,d\mu(x)\,d\mu(y)\,d\mu(z), \tag{6.1} \]

where the supremum is over all $f : S^2 \to [0,1]$ and $g : S \to [0,1]$.

Let $\mathcal{F}_1$ be the $\sigma$-field on $S$ obtained by pulling back that on $S_1$. Thus the $\mathcal{F}_1$-measurable functions are all functions of the form $h(x_1, x_2) = h_1(x_1)$ for measurable $h_1$ on $S_1$. In (6.1) we may replace $f$ and $g$ by their conditional expectations given $\mathcal{F}_1 \times \mathcal{F}_1$ and $\mathcal{F}_1$, respectively. Recalling that $\prec_1^*$ is lexicographic, and noting that the integrand vanishes when $x_1 = y_1$, (6.1) reduces to

\[ \sup_{f_1,g_1} \iiint_{x_1 \prec_1 y_1} \bigl(W(x_1, z_1) - W(y_1, z_1)\bigr) f_1(x_1, y_1)\, g_1(z_1) \,d\mu_1(x_1)\,d\mu_1(y_1)\,d\mu_1(z_1), \]

with the supremum over $f_1 : S_1^2 \to [0,1]$ and $g_1 : S_1 \to [0,1]$. By (4.8), this is simply $\Omega_1(W, \prec_1)$.

(In the special case when $\mu = \mu_1 \times \mu_2$, the argument above is equivalent to simply integrating over $x_2, y_2, z_2$ in (6.1).)

For $\Omega_2$, the argument is similar, using (4.9) instead of (4.8). □

Proof of Theorem 4.5. Here it makes no difference whether we consider $\Omega_1$ or $\Omega_2$, so we simply write $\Omega$.

If $W = W'$ a.e. where $W'$ is monotone, then we have $\Omega(W, \prec, A) = \Omega(W', \prec, A) = 0$ for all $A \subseteq S$, and hence $\Omega(W, \prec) = 0$.

Conversely, suppose that $\Omega(W, \prec) = 0$. Let $A, B, C, D \subseteq S$ have positive measures, and suppose that $A \prec B$. Since $\Omega(W, \prec) = 0$, we have $\Omega(W, \prec, C) = 0$ and thus by (4.7) $W_C(x) \le W_C(y)$ for a.e. $(x,y)$ with $x \prec y$, and in particular for a.e. $(x,y) \in A \times B$. Averaging over all such $(x,y)$ yields $W(A,C) \le W(B,C)$. Similarly, by symmetry, if $C \prec D$, then $W(B,C) \le W(B,D)$. Consequently, letting $A \preceq B$ mean $A \prec B$ or $A = B$,

\[ W(A,C) \le W(B,D) \quad \text{if } A \preceq B,\ C \preceq D. \tag{6.2} \]

Assuming still that $A, B, C, D \subseteq S$ have positive measures, suppose that $A \prec B$ and $C \prec D$. If $A_1 \subseteq A$ and $C_1 \subseteq C$, then (6.2), applied to $A_1, B, C_1, D$, yields

\[ \iint_{A_1 \times C_1} W \le (\mu \times \mu)(A_1 \times C_1)\, W(B,D). \]

Since every measurable subset of $A \times C$ can be approximated (in measure) by a finite disjoint union of rectangle sets $A_i \times C_i$, and $W$ is bounded, it follows that

\[ \iint_E W \le (\mu \times \mu)(E)\, W(B,D) \quad \text{for every } E \subseteq A \times C. \]

Taking $E := \{(x,y) \in A \times C : W(x,y) > W(B,D)\}$, we obtain $(\mu \times \mu)(E) = 0$, and thus

\[ W(x,y) \le W(B,D) \text{ a.e. on } A \times C \text{ when } A \prec B \text{ and } C \prec D. \tag{6.3} \]

Similarly, by reversing the inequalities,

\[ W(x,y) \ge W(B,D) \text{ a.e. on } A \times C \text{ when } A \succ B \text{ and } C \succ D. \tag{6.4} \]

Suppose now that $S$ is atomless, and consider, for a given $n$, the partition $\mathcal{P}_n = (A_i)_{i=1}^{n}$ defined in (5.3). By (6.2), $W_n := W_{\mathcal{P}_n}$ is a monotone kernel. By (6.3) and (6.4), $W_n^-(x,y) \le W(x,y) \le W_n^+(x,y)$ a.e. on each $A_i \times A_j$, and thus a.e. on $S^2$. Further, by averaging this or directly from (6.2), also $W_n^- \le W_n \le W_n^+$. It follows as in the proof of Lemma 5.3 that

\[ \|W_n - W\|_{L^1(S^2)} \le 4/n. \tag{6.5} \]

Now consider the sequence $W_{2^k}$, $k \ge 1$. By (6.5) and the Borel–Cantelli lemma, or by the martingale convergence theorem, $W_{2^k} \to W$ a.e. as $k \to \infty$. Hence, if we define $W' := \limsup_{k \to \infty} W_{2^k}$, then $W' = W$ a.e. and $W'$ is a monotone kernel. This completes the proof when $S$ is atomless.

If $S$ has atoms, we may either modify the argument above, or use our standard trick of replacing $S$ by $S \times [0,1]$, using Lemma 6.1; this gives a monotone kernel $W'$ on $S \times [0,1]$ with $W'((x,a),(y,b)) = W(x,y)$ for a.e. $(x,a,y,b) \in (S \times [0,1])^2$, and thus $W$ is a.e. equal to the monotone kernel $W''$ on $S$ defined by $W''(x,y) = \int_0^1 \int_0^1 W'((x,a),(y,b)) \,da\,db$. □

Proof of Lemma 4.6. Let $I_i := ((i-1)/n, i/n]$ and for $A \subseteq [0,1]$, set $A_i := A \cap I_i$. For $j = 1,2$, by (4.7) (and the definition of $\Omega_2(W, \prec, A)$ when $j = 2$), $\Omega_j(W_G, <, A)$ depends only on the numbers $a_i := \mu(A_i) \in [0, 1/n]$; moreover, since the function $u \mapsto u_+$ is convex, $\Omega_j(W_G, <, A)$ is a convex function of $(a_1, \dots, a_n)$; hence it attains its maximum when each $a_i$ is either $0$ or $1/n$. In other words, it suffices to consider $A = \bigcup_{i \in B} I_i$ for some $B \subseteq V$. In this case, it is easily seen that $\Omega_j(W_G, <, A) = \Omega_j(G, \prec, B)$, noting that $\int_A W_G(x,z)\,dz = \int_A W_G(y,z)\,dz$ if $x, y \in I_i$ for some $i$. The result follows by taking the maximum over $B \subseteq V$. □

Lemma 6.2. Let $(S, \prec)$ be an ordered probability space, and let $j \in \{1,2\}$.

(i) If $W_1, W_2 \in L^1(S^2)$, then

\[ \Omega_j(W_1 + W_2, \prec, A) \le \Omega_j(W_1, \prec, A) + \Omega_j(W_2, \prec, A), \]
\[ \Omega_j(W_1 + W_2, \prec) \le \Omega_j(W_1, \prec) + \Omega_j(W_2, \prec). \]

(ii) If $W \in L^1(S^2)$, then $\Omega_j(W, \prec) \le j\|W\|_\square$.

(iii) If $W_1, W_2 \in L^1(S^2)$, then $|\Omega_j(W_1, \prec) - \Omega_j(W_2, \prec)| \le j\|W_1 - W_2\|_\square$.
Proof. (i): An immediate consequence of the inequality $(a + b)_+ \le a_+ + b_+$ for real $a$ and $b$, and the definitions (4.2)–(4.4).

(ii): By (4.7) and Fubini's theorem,

\[ \Omega_1(W, \prec, A) \le \iint_{x \prec y} \bigl(|W_A(x)| + |W_A(y)|\bigr) \,d\mu(x)\,d\mu(y) \]
\[ = \int_S \mu\{y : y \succ x\}\,|W_A(x)| \,d\mu(x) + \int_S \mu\{x : x \prec y\}\,|W_A(y)| \,d\mu(y) \]
\[ = \int_S \mu\{z : z \ne x\}\,|W_A(x)| \,d\mu(x) \le \int_S |W_A(x)| \,d\mu(x) \]
\[ = \int_{S^2} W(x,y) f(x) g(y) \,d\mu(x)\,d\mu(y) \le \|W\|_\square, \]

where $f(x) := \operatorname{sign}(W_A(x))$ and $g(y) := \mathbf{1}_A(y)$; the final inequality follows from the definition (2.1) of the cut norm. Now apply (4.3), if $j = 2$, and take the supremum over $A$.

(iii): A simple consequence of (i), applied to the sums $W_1 + (W_2 - W_1)$ and $W_2 + (W_1 - W_2)$, and (ii). □

The function $W_S = W_S(x) := \int_S W(x,y) \,d\mu(y)$ is known as the marginal of $W$. (There is also a second marginal, obtained by integrating over the first variable. Here we consider only symmetric functions, so the two marginals coincide.) It is well known that the marginal of a kernel is the natural analogue of the degree sequence of a graph, see e.g. 0]. We have the following analogue of Lemma 3.4.

Lemma 6.3. Let $\prec$ be a (measurable) order on $S$ and assume that $x \prec y \implies W_S(x) \le W_S(y)$. Then $\Omega_2(W, \prec) = \Omega_2(W)$.

Proof. Follow the proof of Lemma 3.4, replacing sums by integrals and degrees by the values of $W_S$. □

Remark 6.4. For $\Omega_1$, it follows by (4.6) that $\Omega_1(W, \prec) \le 2\Omega_1(W)$. The factor 2 here is best possible, just as in Corollary 3.5. This can be seen by taking $W = W_G$ where $G$ is the complete bipartite graph $K_{m,m}$ considered in Example 3.7.

Corollary 6.5. Let $S$ be a probability space and $W$ a kernel on $S$. Then $\Omega_2(W) = 0$ if and only if there exists an order $\prec$ on $S$ such that $\Omega_2(W, \prec) = 0$.

Proof. The 'if' direction is clear. Thus, assume $\Omega_2(W) = 0$. Then there exists a measurable order $\prec_0$ on $S$. Define an order $\prec$ on $S$ by

\[ x \prec y \quad \text{if } W_S(x) < W_S(y) \text{ or } \bigl(W_S(x) = W_S(y) \text{ and } x \prec_0 y\bigr). \tag{6.6} \]

This is a measurable order to which Lemma 6.3 applies, so $\Omega_2(W, \prec) = \Omega_2(W) = 0$. □

Of course, the same result for $\Omega_1$ follows by (4.6).
Proof of Lemma 4.7. Recall that $W_G = W_{G,\prec}$ depends on a labelling of the vertices of $G$, via the associated order $\prec$ on $V(G)$. However, $\Omega_2(W_{G,\prec})$ is independent of the order $\prec$.

For any order $\prec$ on $V = V(G)$, Lemma 4.6 shows that, using $\prec$ to define $W_G$, and writing $<$ for the standard order on $[0,1]$, we have $\Omega_2(W_G) \le \Omega_2(W_G, <) = \Omega_2(G, \prec)$. Thus $\Omega_2(W_G) \le \Omega_2(G)$.

Conversely, let $\prec$ be an order on $V$ such that $v \prec w \implies d(v) \le d(w)$, and use this order to define $W_G$. Then $W_G$ satisfies the assumption of Lemma 6.3 with the standard order $<$ on $[0,1]$, and thus $\Omega_2(W_G, <) = \Omega_2(W_G)$. Hence, by Lemma 4.6 again,

\[ \Omega_2(G) \le \Omega_2(G, \prec) = \Omega_2(W_G, <) = \Omega_2(W_G). \qquad □ \]

Our next lemma shows that $\Omega_2$ is continuous with respect to the cut metric.

Lemma 6.6. If $W_1$ and $W_2$ are kernels on probability spaces $S_1$ and $S_2$, and there exists a measurable order on $S_1$, then $\Omega_2(W_1) \le \Omega_2(W_2) + 2\delta_\square(W_1, W_2)$.

Proof. Recall that the set of step functions is dense in $L^1(S_1^2)$. Hence, for any $\varepsilon > 0$, there exists a step kernel $W'_1$ on $S_1$ with $\|W_1 - W'_1\|_\square \le \|W_1 - W'_1\|_{L^1(S_1^2)} < \varepsilon$. By Lemma 6.2(iii), replacing $W_1$ by $W'_1$ changes $\Omega_2(W_1)$ by less than $2\varepsilon$, and the same holds for $\delta_\square(W_1, W_2)$. Hence, it suffices to prove the result when $W_1$ is a step kernel.

Consequently, assume that $W_1$ is a $\mathcal{P}$-step kernel, for a finite partition $\mathcal{P} = (A_i)_i$ of $S_1$. Then its marginal $W_{1,S}$ is constant on each $A_i$, and we may assume that $A_1, A_2, \dots$ are labelled such that $W_{1,S}(x) \le W_{1,S}(y)$ if $x \in A_i$, $y \in A_j$ with $i < j$. Let $\prec_0$ be a measurable order on $S_1$, and define $\prec_1$ by

\[ x \prec_1 y \quad \text{if } x \in A_i \text{ and } y \in A_j \text{ with } \bigl(i < j \text{ or } (i = j \text{ and } x \prec_0 y)\bigr). \]

Let $\prec_2$ be any measurable order on $S_2$. Consider a coupling $(\pi_1, \pi_2)$ defined on $(S_1 \times S_2, \mu)$ for some $\mu$. Let $\prec_1^*$ be the lexicographic order on $S_1 \times S_2$, and let $\prec_2^*$ be the lexicographic order with the factors in opposite order. By Lemma 6.1,

\[ \Omega_2(W_k, \prec_k) = \Omega_2(W_k^{\pi_k}, \prec_k^*), \qquad k = 1,2. \tag{6.7} \]

Moreover, Lemma 6.3 applies to $\prec_1^*$ and $W_1^{\pi_1}$ and shows that

\[ \Omega_2(W_1^{\pi_1}, \prec_1^*) = \Omega_2(W_1^{\pi_1}) \le \Omega_2(W_1^{\pi_1}, \prec_2^*), \tag{6.8} \]

and by Lemma 6.2(iii),

\[ \Omega_2(W_1^{\pi_1}, \prec_2^*) \le \Omega_2(W_2^{\pi_2}, \prec_2^*) + 2\|W_1^{\pi_1} - W_2^{\pi_2}\|_\square. \tag{6.9} \]

Combining (6.7)–(6.9), we find

\[ \Omega_2(W_1, \prec_1) \le \Omega_2(W_2, \prec_2) + 2\|W_1^{\pi_1} - W_2^{\pi_2}\|_\square, \]

and the result follows by taking the infimum over all such couplings $(\pi_1, \pi_2)$, i.e., over all probability measures $\mu$ with the right marginals, and then over all orders $\prec_2$. □
Corollary 6.7. If $W_1$ and $W_2$ are equivalent kernels on probability spaces $S_1$ and $S_2$ that have measurable orders, then $\Omega_2(W_1) = \Omega_2(W_2)$, and $\tfrac{1}{2}\Omega_1(W_2) \le \Omega_1(W_1) \le 2\Omega_1(W_2)$.

Proof. We have $\delta_\square(W_1, W_2) = 0$; the first statement follows by Lemma 6.6. To deduce the second, use (4.6). □

Remark 6.8. The equivalent of Lemma 6.6 for $\Omega_1$ does not hold, and the inequalities $\tfrac{1}{2}\Omega_1(W_2) \le \Omega_1(W_1) \le 2\Omega_1(W_2)$ in Corollary 6.7 are best possible. In fact, if $W_m := W_G^V$ is the kernel defined in Remark 4.9 for the bipartite graph $K_{m,m}$, then $W_m$ is equivalent to $W_{K_{m,m}}$ (defined on $[0,1]$), but $W_{K_{m,m}}$ is the same for all $m$. Hence, all $W_m$ are equivalent. Nevertheless, Remark 4.9 and (3.20) show that $\Omega_1(W_m) = \Omega_1(K_{m,m}) = (1 + m^{-2})/16$ if $m$ is odd, while $\Omega_1(W_m) = \Omega_1(K_{m,m}) = 1/16$ if $m$ is even. In particular, $\Omega_1(W_1) = 1/8 = 2\Omega_1(W_2)$.

On the other hand, for kernels $W_1, W_2$ on the standard space $S = [0,1]$ (and thus for kernels on any atomless Borel spaces), it follows from (2.5) and Lemma 6.2(iii) that $|\Omega_1(W_1) - \Omega_1(W_2)| \le \delta_\square(W_1, W_2)$, since clearly $\Omega_1(W_2^{\varphi}) = \Omega_1(W_2)$ for a measure-preserving bijection $\varphi$. In particular, $\Omega_1(W_1) = \Omega_1(W_2)$ for any two equivalent kernels on $[0,1]$. Hence the unruly behaviour of $\Omega_1$ is caused by the atoms.



Proof of Theorem 4.11. (i) $\Rightarrow$ (ii). We use $\Omega_2$. If $\Omega_2(W) = 0$, then by
Corollary 6.5 there exists an order $\prec$ on $\mathcal S$ such that $\Omega_2(W, \prec) = 0$, and
Theorem 4.5 shows that $W$ is a.e. equal to a monotone kernel on $(\mathcal S, \prec)$.

(ii) $\Rightarrow$ (iii). Trivial.

(iii) $\Rightarrow$ (i). If $W$ is equivalent to a monotone kernel $W'$ on some probability
space $\mathcal S'$, then $\delta_\square(W, W') = 0$ and $\Omega_2(W') = 0$, and thus $\Omega_2(W) = 0$
by Lemma 6.6.

(iii) $\Leftrightarrow$ (iv) $\Leftrightarrow$ (v). By TheoremEfl □

7. Proof of Theorems 1.5 and 1.6

After the preparation above, the proofs are simple. 

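Proof of Theorem 1.6. Let $W$ be a kernel on $[0, 1]$ representing $\Gamma$, i.e., $\Gamma = \Gamma_W$
and $G_\nu \to W$. Since $G_\nu \to W$, we have $\delta_\square(W_{G_\nu}, W) \to 0$.

Suppose first that $\Gamma \in \mathcal U_\uparrow$; we then may choose $W \in \mathcal W_\uparrow$, and thus
$\Omega_2(W, <) = 0$, so $\Omega_2(W) = 0$. Then, by Lemmas 4.7 and 6.6,

$$\Omega_2(G_\nu) = \Omega_2(W_{G_\nu}) \le \Omega_2(W) + 2\delta_\square(W_{G_\nu}, W) = 2\delta_\square(W_{G_\nu}, W) \to 0.$$

Hence $\Omega_2(G_\nu) \to 0$, and by Lemma ETTl $\Omega_0(G_\nu) \to 0$ as well.

Conversely, suppose that $\Omega_0(G_\nu) \to 0$, and thus by Lemma IXTl $\Omega_2(G_\nu) \to 0$.
Then, by Lemmas 6.6 and 4.7 again,

$$\Omega_2(W) \le \Omega_2(W_{G_\nu}) + 2\delta_\square(W_{G_\nu}, W) = \Omega_2(G_\nu) + 2\delta_\square(W_{G_\nu}, W) \to 0,$$

and thus $\Omega_2(W) = 0$. Hence, $\Gamma = \Gamma_W \in \mathcal U_\uparrow$ by Theorem 4.11. □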
Proof of Theorem 1.5. If $\Omega_0(G_\nu) \to 0$, then the same holds for every subsequence.
Hence Theorem 1.6 shows that every convergent subsequence has a
limit that is in $\mathcal U_\uparrow$, which by definition says that $(G_\nu)$ is quasimonotone.

Conversely, suppose that $(G_\nu)$ is quasimonotone but $\Omega_0(G_\nu) \not\to 0$. We
can then find $\varepsilon > 0$ and a subsequence along which $\Omega_0(G_\nu) > \varepsilon$. By restricting
to a suitable subsubsequence, we may further assume that $(G_\nu)$
converges to some limit $\Gamma$. By the assumption that $(G_\nu)$ is quasimonotone,
$\Gamma \in \mathcal U_\uparrow$ and thus by Theorem 1.6 $\Omega_0(G_\nu) \to 0$ along the subsubsequence, a
contradiction. □

8. Quasithreshold graphs

In the definition (1.5) of $\Omega_0(G, \prec)$, we take the maximum over $A$ of the
sum in (1.3). If instead we take the maximum inside the sum, then we
obtain the functional

$$\Omega_0^*(G, \prec) := \frac{1}{n^3} \sum_{v \prec w} \bigl|N(v) \setminus (N(w) \cup \{w\})\bigr|, \tag{8.1}$$

since $\max_A \bigl(e(v, A \setminus \{w\}) - e(w, A \setminus \{v\})\bigr)$ is obtained by taking (for example)
$A = N(v) \setminus N(w)$. From $\Omega_1$, we similarly obtain the slightly simpler
functional

$$\Omega_1^*(G, \prec) := \frac{1}{n^3} \sum_{v \prec w} |N(v) \setminus N(w)| = \Omega_0^*(G, \prec) + O(1/n). \tag{8.2}$$
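The two graph functionals (8.1) and (8.2) are easy to evaluate directly. The following is a minimal sketch, assuming the $n^{-3}$ normalization and a graph stored as a dictionary of neighbour sets; the function name `omega_star` and this input format are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations

def omega_star(adj, order, j):
    """Return Omega*_j(G, order) for j in {0, 1}, following (8.1) and (8.2).

    adj   : dict mapping each vertex to the set of its neighbours (assumed format)
    order : list of all vertices, earliest first
    """
    n = len(order)
    total = 0
    for v, w in combinations(order, 2):   # every pair with v before w
        missing = adj[v] - adj[w]         # N(v) \ N(w)
        if j == 0:
            missing = missing - {w}       # N(v) \ (N(w) union {w})
        total += len(missing)
    return Fraction(total, n ** 3)

# Path 0 - 1 - 2, ordered with the two leaves first and the centre last.
path = {0: {1}, 1: {0, 2}, 2: {1}}
order = [0, 2, 1]
print(omega_star(path, order, 1))  # 2/27
print(omega_star(path, order, 0))  # 0
```

The two values differ only through pairs $v \prec w$ with $w \in N(v)$, of which there are at most $e(G) \le n^2/2$; this is the $O(1/n)$ term in (8.2).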

For a kernel $W$ on an ordered probability space $(\mathcal S, \mu, \prec)$, taking the
supremum over $A$ inside the double integral in (4.2), we define

$$\Omega_1^*(W, \prec) := \iiint_{x \prec y} \bigl(W(x,z) - W(y,z)\bigr)_+ \, d\mu(x)\, d\mu(y)\, d\mu(z). \tag{8.3}$$

(Cf. (4.8).) For any graph $G$ with an ordering $\prec$ of the vertices, corresponding
to Lemma 4.6 we have

$$\Omega_1^*(W_G, \prec) = \Omega_1^*(G, \prec). \tag{8.4}$$

Obviously, $\Omega_0^*(G, \prec) \ge \Omega_0(G, \prec)$, and similarly for $\Omega_1$ and $\Omega_1^*$.
Let

$$\Omega_j^*(G) := \min_{\prec} \Omega_j^*(G, \prec) \quad (j = 0, 1), \qquad \Omega_1^*(W) := \inf_{\prec} \Omega_1^*(W, \prec). \tag{8.5}$$

For kernels, we can use $\Omega_1^*$ instead of $\Omega_1$ to characterize monotonicity, cf.
Theorems 4.5 and 4.11.

Theorem 8.1. Let $(\mathcal S, \mu, \prec)$ be an ordered probability space and $W$ a kernel
on $(\mathcal S, \mu)$. Then $\Omega_1^*(W, \prec) = 0$ if and only if $W$ is a.e. equal to a monotone
kernel.

Proof. If $W$ is a.e. equal to a monotone kernel, then $W(x, z) \le W(y, z)$ for
a.e. $(x, y, z)$ with $x \prec y$, and thus $\Omega_1^*(W, \prec) = 0$. The converse follows by
Theorem 4.5 since $\Omega_1(W, \prec) \le \Omega_1^*(W, \prec)$. □
Theorem 8.2. Let $W$ be a kernel on a probability space $\mathcal S$ with at least
one measurable order. Then $\Omega_1^*(W) = 0$ if and only if $W$ is a.e. equal to a
monotone kernel on $(\mathcal S, \prec)$ for some order $\prec$ on $\mathcal S$.

Theorem 4.11 gives further equivalent conditions, for example that $\Gamma_W$ is
a monotone graph limit.

Proof. If $\Omega_1^*(W) = 0$, then $\Omega_1(W) = 0$, since $\Omega_1(W) \le \Omega_1^*(W)$. Hence the
conclusion follows by Theorem 4.11.

Conversely, if $W$ is a.e. equal to a monotone kernel on $(\mathcal S, \prec)$, then
$\Omega_1^*(W) \le \Omega_1^*(W, \prec) = 0$ by Theorem 8.1. □

For a sequence of graphs, we cannot replace $\Omega_0$ by $\Omega_0^*$ in Theorem 1.5.
In fact, we have the following result, which shows that $\Omega_0^*(G_\nu) \to 0$ characterizes
threshold graph limits rather than monotone graph limits. (Recall
that threshold graph limits are the monotone graph limits that correspond
to 0/1-valued kernels; see Remark 1.7.)

As usual, we define the edit distance $d_e(G, G')$ of two graphs on the same
vertex set $V(G) = V(G')$ by $d_e(G, G') = |E(G) \triangle E(G')|$. If $\mathcal A$ is a class of
graphs, then

$$d_e(G, \mathcal A) := \inf\{d_e(G, G') : G' \in \mathcal A \text{ and } V(G') = V(G)\}. \tag{8.6}$$
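For very small graphs the distance (8.6) to the class of threshold graphs can be computed by exhaustive search. The sketch below is only an illustration: it uses the standard characterization that a graph is a threshold graph exactly when it can be dismantled by repeatedly deleting an isolated or a dominating vertex, and all function names are hypothetical.

```python
from itertools import combinations

def edit_distance(edges_g, edges_h):
    """Size of the symmetric difference of the edge sets (same vertex set assumed)."""
    g = {frozenset(e) for e in edges_g}
    h = {frozenset(e) for e in edges_h}
    return len(g ^ h)

def is_threshold(vertices, edges):
    """Peel off isolated or dominating vertices; succeeds iff the graph is threshold."""
    vs = set(vertices)
    es = {frozenset(e) for e in edges}
    while vs:
        deg = {v: sum(1 for e in es if v in e) for v in vs}
        pick = next((v for v in vs if deg[v] in (0, len(vs) - 1)), None)
        if pick is None:
            return False
        vs.remove(pick)
        es = {e for e in es if pick not in e}
    return True

def distance_to_threshold(vertices, edges):
    """Brute-force d_e(G, T); exponential in the number of vertex pairs."""
    vertices = list(vertices)
    es = {frozenset(e) for e in edges}
    pairs = [frozenset(p) for p in combinations(vertices, 2)]
    for k in range(len(pairs) + 1):            # try k edge flips, smallest k first
        for flips in combinations(pairs, k):
            if is_threshold(vertices, es ^ set(flips)):
                return k

p4 = [(0, 1), (1, 2), (2, 3)]                   # path on four vertices
print(distance_to_threshold(range(4), p4))      # 1
```

The path on four vertices is not a threshold graph, but deleting an end edge leaves a path on three vertices plus an isolated vertex, which is one.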

Theorem 8.3. Let $(G_\nu)$ be a sequence of graphs with $|G_\nu| \to \infty$. Then the
following are equivalent.

(i) $\Omega_0^*(G_\nu) \to 0$.

(ii) Every convergent subsequence of $(G_\nu)$ has a limit that is a threshold
graph limit.

(iii) $d_e(G_\nu, \mathcal T) = o(|G_\nu|^2)$, where $\mathcal T$ is the class of threshold graphs.

(iv) There exists a sequence of threshold graphs $G'_\nu$ with $V(G'_\nu) = V(G_\nu)$
and $|E(G_\nu) \triangle E(G'_\nu)| = o(|G_\nu|^2)$.

(v) There exists a sequence of threshold graphs $G'_\nu$ with $V(G'_\nu) = V(G_\nu)$
and $\|W_{G_\nu} - W_{G'_\nu}\|_{L^1(\mathcal S^2)} = o(1)$.

(vi) There exists a sequence of threshold graphs $G'_\nu$ with $V(G'_\nu) = V(G_\nu)$
and $\|W_{G_\nu} - W_{G'_\nu}\|_\square = o(1)$.

We say that a sequence $(G_\nu)$ of graphs with $|G_\nu| \to \infty$ is quasithreshold
if it satisfies one, and thus all, of the conditions in Theorem 8.3.

As a special case of the equivalence (i) $\Leftrightarrow$ (ii), we see that if $G_\nu \to \Gamma$,
then $\Gamma$ is a threshold graph limit if and only if $\Omega_0^*(G_\nu) \to 0$; cf. Theorem 1.6.
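For an actual threshold graph the functional in (8.1) vanishes when the vertices are ordered by degree, because the neighbourhoods are nested. The following sketch checks this on one example; the creation-sequence construction (each new vertex is either isolated or joined to all earlier ones) is a standard description of threshold graphs, and the function names are illustrative assumptions.

```python
from itertools import combinations

def threshold_graph(seq):
    """Build a threshold graph: vertex i is joined to all earlier vertices iff seq[i]."""
    adj = {i: set() for i in range(len(seq))}
    for i, dominating in enumerate(seq):
        if dominating:
            for j in range(i):
                adj[i].add(j)
                adj[j].add(i)
    return adj

def omega0_star(adj, order):
    """n^{-3} times the sum over v before w of |N(v) minus (N(w) union {w})|."""
    n = len(order)
    total = sum(len(adj[v] - adj[w] - {w}) for v, w in combinations(order, 2))
    return total / n ** 3

adj = threshold_graph([False, True, False, False, True, True, False])
by_degree = sorted(adj, key=lambda v: len(adj[v]))   # increasing degree

print(omega0_star(adj, by_degree))        # 0.0
print(omega0_star(adj, by_degree[::-1]))  # positive for the reversed order
```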

The proof of Theorem 8.3 is simpler than the proof of Theorem 1.5, but
we will nevertheless need some other results first. One complication is that
there is no analogue of Lemma 6.2(iii): as is shown by the following example,
$\Omega_1^*(W, \prec)$ is not continuous for the cut norm.

Example 8.4. Let $W = 1/2$ be constant on $[0, 1]^2$, and let $(G_n)$ be a
sequence of graphs with $|G_n| = n$ and $G_n \to W$, i.e., $(G_n)$ is a sequence of
quasirandom graphs. (E.g., let $G_n$ be random graphs $G(n, 1/2)$.) Then, for
every $\varepsilon > 0$, $\bigl||N(v) \setminus N(w)| - n/4\bigr| < \varepsilon n$ for all but $o(n^2)$ pairs $(v, w) \in V(G_n)^2$,
and thus for any order $\prec$, $|n^3 \Omega_1^*(G_n, \prec) - n^3/8| < \varepsilon n^3 + o(n^3)$, so
$|\Omega_1^*(G_n, \prec) - 1/8| < \varepsilon + o(1)$. Since $\varepsilon$ is arbitrary, it follows that

$$\Omega_1^*(W_{G_n}) = \Omega_1^*(G_n) \to \tfrac18 \ne 0 = \Omega_1^*(W),$$

although $\|W_{G_n} - W\|_\square \to 0$.
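The value $1/8$ in Example 8.4 can be observed numerically. The sketch below samples one random graph $G(n, 1/2)$ and evaluates $\Omega_1^*(G_n, \prec)$ for the natural order and for the degree order; the size $n = 100$ and the seed are arbitrary choices made for illustration, and for such a small $n$ the values are only roughly $1/8$.

```python
import random
from itertools import combinations

def omega1_star(adj, order):
    """n^{-3} times the sum over v before w of |N(v) minus N(w)|, as in (8.2)."""
    n = len(order)
    return sum(len(adj[v] - adj[w]) for v, w in combinations(order, 2)) / n ** 3

def gnp(n, p, seed):
    """Sample G(n, p) as a dict of neighbour sets, using a fixed seed."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v, w in combinations(range(n), 2):
        if rng.random() < p:
            adj[v].add(w)
            adj[w].add(v)
    return adj

adj = gnp(100, 0.5, seed=0)
natural = omega1_star(adj, list(range(100)))
by_degree = omega1_star(adj, sorted(adj, key=lambda v: len(adj[v])))
print(natural, by_degree)   # both close to 1/8, far from 0
```

No ordering brings the value near $0$, in contrast with $\Omega_1^*(W) = 0$ for the constant limit kernel.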

$\Omega_1^*$ is obviously continuous in the stronger $L^1$ norm. It is possible to prove
Theorem 8.3 using this fact and Lemma 8.13 below, but it is simpler to use
another extension of $\Omega_1^*$ to kernels.

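Definition. If $(\mathcal S, \mu)$ is an atomless probability space and $\prec$ an order on $\mathcal S$,
let

$$\widetilde\Omega_1^*(W, \prec) := \iiint_{x \prec y} W(x,z)\bigl(1 - W(y,z)\bigr) \, d\mu(x)\, d\mu(y)\, d\mu(z). \tag{8.7}$$

If $\mathcal S$ has atoms, we add half the integral over $x = y$ (and any $z$), i.e., we add
$\tfrac12 \iint W(x,z)\bigl(1 - W(x,z)\bigr)\, \mu\{x\}\, d\mu(x)\, d\mu(z)$.

The definition in the case that $\mathcal S$ has atoms is such that $\widetilde\Omega_1^*(W, \prec) =
\widetilde\Omega_1^*(\hat W, \hat\prec)$, where $\hat W$ is the extension of $W$ to the atomless probability space
$\hat{\mathcal S} := \mathcal S \times [0, 1]$ and $\hat\prec$ is the lexicographic order on $\hat{\mathcal S}$.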

Note that if $W$ is 0/1-valued, then $\widetilde\Omega_1^*(W, \prec) = \Omega_1^*(W, \prec)$. In particular,
for any graph $G$ with an order $\prec$ on $V = V(G)$, by (8.4),

$$\Omega_1^*(G, \prec) = \Omega_1^*(W_G, \prec) = \widetilde\Omega_1^*(W_G, \prec). \tag{8.8}$$
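The relation between (8.3) and (8.7) can be checked on step kernels. The sketch below represents a kernel on $n$ atoms of mass $1/n$ (ordered by index) as an $n \times n$ matrix and includes the half-diagonal correction for atoms; the function names and the matrix representation are assumptions made for illustration only.

```python
from fractions import Fraction

def omega1_star_step(W):
    """(8.3) for a step kernel given as an n x n matrix on n equal atoms."""
    n = len(W)
    total = sum(max(W[i][k] - W[j][k], 0)
                for i in range(n) for j in range(i + 1, n) for k in range(n))
    return Fraction(total) / n ** 3

def omega_tilde_step(W):
    """(8.7) plus half of the diagonal x = y, as prescribed for spaces with atoms."""
    n = len(W)
    off = sum(W[i][k] * (1 - W[j][k])
              for i in range(n) for j in range(i + 1, n) for k in range(n))
    diag = sum(W[i][k] * (1 - W[i][k]) for i in range(n) for k in range(n))
    return (Fraction(off) + Fraction(diag) / 2) / n ** 3

half = Fraction(1, 2)
constant = [[half] * 3 for _ in range(3)]      # the kernel W = 1/2
star = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]       # 0/1 kernel of a path, leaves first

print(omega1_star_step(constant), omega_tilde_step(constant))  # 0 1/8
print(omega1_star_step(star), omega_tilde_step(star))          # 2/27 2/27
```

For the 0/1-valued matrix the two functionals agree, as in (8.8), while for the constant kernel $1/2$ they differ.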

For our purposes $\widetilde\Omega_1^*$ is better than $\Omega_1^*$ in two different ways. The first
is that, unlike $\Omega_1^*$, $\widetilde\Omega_1^*$ is continuous with respect to the cut norm. Before
proving this, we recall a basic property of the cut norm. (See e.g. 14] for a
proof.)

Lemma 8.5. If $W \in L^1(\mathcal S^2)$, then $\|W_S\|_{L^1(\mathcal S)} \le \|W\|_\square$. □

Recall that, by definition, a kernel $W$ takes values in $[0, 1]$.

Lemma 8.6. Let $(\mathcal S, \prec)$ be an ordered probability space. If $W_1$ and $W_2$ are
kernels on $\mathcal S$, then $|\widetilde\Omega_1^*(W_1, \prec) - \widetilde\Omega_1^*(W_2, \prec)| \le 2\|W_1 - W_2\|_\square$.
Proof. We may assume that $\mathcal S$ is atomless. (Otherwise we consider $\mathcal S \times [0, 1]$.)
In this case, writing $U_x$ for $\{y : y \succ x\}$, we have the alternative formula

$$\begin{aligned}
\widetilde\Omega_1^*(W, \prec)
&= \iint W(x,z)\, \mu(U_x)\, d\mu(x)\, d\mu(z) - \iiint_{x \prec y} W(x,z) W(y,z)\, d\mu(x)\, d\mu(y)\, d\mu(z) \\
&= \iint \mu(U_x) W(x,z)\, d\mu(x)\, d\mu(z) - \frac12 \iiint W(x,z) W(y,z)\, d\mu(x)\, d\mu(y)\, d\mu(z) \\
&= \iint \mu(U_x) W(x,z)\, d\mu(x)\, d\mu(z) - \frac12 \int W_S(z)^2\, d\mu(z).
\end{aligned} \tag{8.9}$$

By the definition (2.1) of the cut norm,

$$\Bigl|\iint \mu(U_x) \bigl(W_1(x,z) - W_2(x,z)\bigr)\, d\mu(x)\, d\mu(z)\Bigr| \le \|W_1 - W_2\|_\square.$$

Recalling that $|W_j| \le 1$ and using Lemma 8.5 on $W_1 - W_2$,

$$\begin{aligned}
\Bigl|\int \bigl(W_{1,S}(z)^2 - W_{2,S}(z)^2\bigr)\, d\mu(z)\Bigr|
&= \Bigl|\int \bigl(W_{1,S}(z) - W_{2,S}(z)\bigr)\bigl(W_{1,S}(z) + W_{2,S}(z)\bigr)\, d\mu(z)\Bigr| \\
&\le 2\|W_{1,S} - W_{2,S}\|_{L^1(\mathcal S)} \le 2\|W_1 - W_2\|_\square.
\end{aligned}$$

Applying (8.9) to $W_1$ and $W_2$, the result follows. □

Theorem 8.7. Let $(\mathcal S, \prec)$ be an ordered probability space and $W$ a kernel
on $(\mathcal S, \prec)$. Then $\widetilde\Omega_1^*(W, \prec) = 0$ if and only if $W$ is a.e. equal to a 0/1-valued
monotone kernel.

Proof. As usual, we may assume for simplicity that $\mathcal S$ is atomless. Suppose
first $\widetilde\Omega_1^*(W, \prec) = 0$. For $a > 0$, let $E_a := \{(x, y) \in \mathcal S^2 : a \le W(x, y) \le 1 - a\}$,
and, for $z \in \mathcal S$, let $E_a(z) := \{x \in \mathcal S : (x, z) \in E_a\}$ be the corresponding
section.

If $x, y \in E_a(z)$, then $W(x,z)(1 - W(y,z)) \ge a^2$, and thus, for each $z$,

$$\iint_{x \prec y} W(x,z)\bigl(1 - W(y,z)\bigr)\, d\mu(x)\, d\mu(y)
\ge a^2\, \mu \times \mu\{(x, y) \in E_a(z)^2 : x \prec y\} = \tfrac12 a^2 \mu(E_a(z))^2.$$

Hence,

$$0 = \widetilde\Omega_1^*(W, \prec) \ge \int \tfrac12 a^2 \mu(E_a(z))^2\, d\mu(z),$$

and thus $\mu(E_a(z)) = 0$ for a.e. $z$, so $\mu \times \mu(E_a) = \int_{\mathcal S} \mu(E_a(z))\, d\mu(z) = 0$.
Consequently, $E_a$ is a null set for every $a > 0$. Hence, $W(x, y) \in \{0, 1\}$
a.e. Thus $W$ is a.e. 0/1-valued, which implies that $\Omega_1^*(W, \prec) = \widetilde\Omega_1^*(W, \prec) = 0$;
hence Theorem 8.1 shows that $W$ is a.e. equal to a monotone kernel
$W'$. Finally, $W'$ is a.e. 0/1-valued, and thus a.e. equal to the 0/1-valued
monotone kernel $\mathbf 1\{W' > 0\}$.

The converse is obvious. □

We also have an analogue of Lemma 6.3. To prove this, we shall need the
following 'rearrangement' inequality.

Lemma 8.8. Let $\prec$ and $<$ be two orders on an atomless probability space
$\mathcal S$, and let $f$ be a bounded function on $\mathcal S$. If $x < y \implies f(x) \le f(y)$, then

$$\iint_{x \prec y} f(x)\, d\mu(x)\, d\mu(y) \ge \iint_{x < y} f(x)\, d\mu(x)\, d\mu(y).$$

Proof. Consider first one arbitrary order $\prec$. Let $D_y := \{x : x \prec y\}$ and set
$\varphi(y) := \mu(D_y)$, and let $D(t)$ be as in Lemma 5.2. Then $D_y$ and $D(\varphi(y))$ are
two downsets with the same measure, and thus they differ only by a null
set, cf. Lemma 5.1.

Let $F(y) := \int_{x \prec y} f(x)\, d\mu(x)$ and define $\alpha(t) := \int_{D(t)} f(x)\, d\mu(x)$. Then

$$F(y) = \int_{D_y} f = \int_{D(\varphi(y))} f = \alpha(\varphi(y)).$$

It was noted in the proof of Lemma 5.2 that if $X$ has distribution $\mu$, then
$\varphi(X)$ has distribution $U(0, 1)$. Equivalently, the function $\varphi : \mathcal S \to [0,1]$
maps $\mu$ to the uniform measure on $[0,1]$. Hence,

$$\iint_{x \prec y} f(x)\, d\mu(x)\, d\mu(y) = \int_{\mathcal S} F(y)\, d\mu(y) = \int_{\mathcal S} \alpha(\varphi(y))\, d\mu(y) = \int_0^1 \alpha(t)\, dt.$$

Now write $\alpha = \alpha_\prec$ and compare $\alpha_\prec(t)$ and $\alpha_<(t)$. Both are integrals of
$f$ over sets of measure $t$, and for $\alpha_<$ the set is such that if $x$ is in the set
and $y$ is not, then $x < y$ and thus $f(x) \le f(y)$. It follows easily that
$\alpha_<(t)$ is the minimum of $\int_E f\, d\mu$ over all sets $E$ of measure $t$, and thus in
particular $\alpha_<(t) \le \alpha_\prec(t)$ for any other order $\prec$. Consequently, $\int_0^1 \alpha_<(t)\, dt \le
\int_0^1 \alpha_\prec(t)\, dt$, and the result follows. □
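A finite analogue of Lemma 8.8 may help intuition: for a list of numbers, the sum of $f(x)$ over all pairs with $x$ listed before $y$ is smallest when the list is sorted increasingly. The sketch below checks this by brute force on one small example; it is a discrete illustration only, not the atomless setting of the lemma, and `pair_sum` is a hypothetical helper name.

```python
from itertools import permutations

def pair_sum(values):
    """Sum of f(x) over pairs (x, y) with x listed before y.

    The entry at position i precedes n - 1 - i later entries.
    """
    n = len(values)
    return sum(f * (n - 1 - i) for i, f in enumerate(values))

f = [3, 1, 4, 1, 5]
best = min(pair_sum(p) for p in permutations(f))
print(pair_sum(sorted(f)), best)   # 17 17
```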

Lemma 8.9. Let $<$ be a (measurable) order on $\mathcal S$ and assume that $x < y \implies
W_S(x) \le W_S(y)$. Then $\widetilde\Omega_1^*(W, <) = \widetilde\Omega_1^*(W)$.

Proof. We may again assume for simplicity that $\mathcal S$ is atomless. Let $\prec$ be
any order on $\mathcal S$. We again use (8.9), which we write as

$$\widetilde\Omega_1^*(W, \prec) = \int \mu(U_x) W_S(x)\, d\mu(x) - \frac12 \int W_S(x)^2\, d\mu(x).$$

The second integral does not depend on $\prec$. Moreover, the first integral
equals $\iint_{x \prec y} W_S(x)\, d\mu(x)\, d\mu(y)$, which by Lemma 8.8 is minimized by taking $\prec$ equal
to $<$. Hence $\widetilde\Omega_1^*(W, \prec) \ge \widetilde\Omega_1^*(W, <)$, and the result follows. □






BELA BOLLOBAS, SVANTE JANSON, AND OLIVER RIORDAN 



Remark 8.10. It follows by (8.8) that the corresponding result holds for
graphs and $\Omega_1^*$: i.e., ordering the vertices by their degrees achieves the minimum
$\min_\prec \Omega_1^*(G, \prec)$.
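The statement of Remark 8.10 can be confirmed by exhaustive search on a small graph. The sketch below compares the degree ordering with the minimum over all $5!$ orderings of a five-vertex graph; the example graph and the helper name `pair_count` are arbitrary illustrative choices, and the common factor $n^{-3}$ is omitted since it does not affect the minimizer.

```python
from itertools import combinations, permutations

def pair_count(adj, order):
    """n^3 times Omega*_1(G, order): sum of |N(v) minus N(w)| over v before w."""
    return sum(len(adj[v] - adj[w]) for v, w in combinations(order, 2))

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
adj = {v: set() for v in range(5)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

by_degree = sorted(adj, key=lambda v: len(adj[v]))        # increasing degree
best = min(pair_count(adj, p) for p in permutations(adj))  # all 120 orderings
print(pair_count(adj, by_degree) == best)                  # True
```

The reason is visible in the identity $|N(v) \setminus N(w)| = d(v) - |N(v) \cap N(w)|$: the second term is symmetric in $v$ and $w$, so only $\sum_{v \prec w} d(v)$ depends on the order.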

Our next result shows that $\widetilde\Omega_1^*$ characterizes kernels that yield threshold
graph limits. Note the parallel and contrast to Theorems 4.11 and |



Theorem 8.11. Let $W$ be a kernel on a probability space $\mathcal S$ with at least
one measurable order. Then the following are equivalent.

(i) $\widetilde\Omega_1^*(W) = 0$.

(ii) There exists an order $\prec$ on $\mathcal S$ such that $W$ is a.e. equal to a 0/1-valued
monotone kernel on $(\mathcal S, \prec)$.

(iii) $W$ is equivalent to a 0/1-valued monotone kernel on some ordered
probability space.

(iv) $W$ is equivalent to a 0/1-valued monotone kernel on $[0,1]$.

(v) $\Gamma_W$ is a threshold graph limit.



Proof. (i) $\Rightarrow$ (ii). There exists a measurable order $\prec_0$ on $\mathcal S$. As in the proof
of Corollary 6.5 we define an order $\prec$ on $\mathcal S$ by (6.6). Lemma 8.9 applies
and yields $\widetilde\Omega_1^*(W, \prec) = \widetilde\Omega_1^*(W) = 0$, and the result follows by Theorem 8.7.

(ii) $\Rightarrow$ (i). Theorem 8.7 yields $\widetilde\Omega_1^*(W, \prec) = 0$ and thus $\widetilde\Omega_1^*(W) \le \widetilde\Omega_1^*(W, \prec) = 0$.

(ii) $\Leftrightarrow$ (iii) $\Leftrightarrow$ (iv). Every kernel equivalent to an a.e. 0/1-valued kernel
is itself a.e. 0/1-valued, see Remark 1.7 and [141 ]. Furthermore, arguing
as in the proof of Theorem 8.7, a monotone kernel $W$ that is a.e. 0/1-valued
is a.e. equal to the 0/1-valued monotone kernel $\mathbf 1\{W > 0\}$. Hence,
(ii) $\Leftrightarrow$ (iii) $\Leftrightarrow$ (iv) follows from the corresponding equivalences in Theorem 4.11.

(iv) $\Leftrightarrow$ (v). As noted in the introduction, this was proved by Diaconis,
Holmes and Janson [7]. □

We need some more preparation before the proof of Theorem 8.3.

Lemma 8.12. Let $W_1$ and $W_2$ be kernels on a probability space $\mathcal S$ with $W_1$
0/1-valued, and let $W_1'$ be a 0/1-valued step kernel with $n$ steps. Then

$$\|W_1 - W_2\|_{L^1(\mathcal S^2)} \le n^2 \|W_1 - W_2\|_\square + 2\|W_1 - W_1'\|_{L^1(\mathcal S^2)}.$$

Proof. Let $\{A_i\}_1^n$ be a partition of $\mathcal S$ such that $W_1'$ is constant 0 or 1 on each
$A_i \times A_j$.

If $W_1' = 0$ on $A_i \times A_j$, then

$$\iint_{A_i \times A_j} |W_1' - W_2| = \iint_{A_i \times A_j} W_2 \le \|W_1 - W_2\|_\square + \iint_{A_i \times A_j} W_1
= \|W_1 - W_2\|_\square + \iint_{A_i \times A_j} |W_1 - W_1'|.$$

If $W_1' = 1$ on $A_i \times A_j$, then

$$\iint_{A_i \times A_j} |W_1' - W_2| = \iint_{A_i \times A_j} (1 - W_2) \le \|W_1 - W_2\|_\square + \iint_{A_i \times A_j} (1 - W_1)
= \|W_1 - W_2\|_\square + \iint_{A_i \times A_j} |W_1 - W_1'|.$$

Thus, in both cases $\iint_{A_i \times A_j} |W_1' - W_2| \le \iint_{A_i \times A_j} |W_1 - W_1'| + \|W_1 - W_2\|_\square$,
and summing over all $i$ and $j$ yields

$$\|W_1' - W_2\|_{L^1} \le \|W_1 - W_1'\|_{L^1} + n^2 \|W_1 - W_2\|_\square.$$

The result follows by $\|W_1 - W_2\|_{L^1} \le \|W_1 - W_1'\|_{L^1} + \|W_1' - W_2\|_{L^1}$. □

Lemma 8.13. Let $W$ and $W_1, W_2, \dots$ be kernels on a probability space $\mathcal S$,
and assume that $W$ is 0/1-valued. Then $\|W_n - W\|_\square \to 0$ as $n \to \infty$ if and
only if $\|W_n - W\|_{L^1(\mathcal S^2)} \to 0$.

Proof. Assume $\|W_n - W\|_\square \to 0$. $W$ is the indicator function $\mathbf 1_A$ of a
measurable set $A \subseteq \mathcal S^2$. Any such set can be approximated in measure by a
finite disjoint union of rectangle sets $\bigcup_i A_i \times B_i$, and we may assume that
this set is symmetric since $A$ is; in other words, given any $\varepsilon > 0$, there
exists a 0/1-valued step kernel $W'$ such that $\|W - W'\|_{L^1} < \varepsilon$. Let the
corresponding partition have $N = N(\varepsilon)$ parts. Lemma 8.12 then yields

$$\|W - W_n\|_{L^1} \le N^2 \|W - W_n\|_\square + 2\varepsilon \to 2\varepsilon$$

as $n \to \infty$. Hence, $\limsup_{n \to \infty} \|W - W_n\|_{L^1} = 0$.

The converse is obvious. □



Proof of Theorem 8.3. Note first that (i) is equivalent to $\Omega_1^*(G_\nu) \to 0$ by (8.2),
and that $\Omega_1^*(G_\nu) = \widetilde\Omega_1^*(W_{G_\nu})$ by (8.8).

(i) $\Rightarrow$ (ii). Assume (i) and consider a subsequence that converges. We
thus assume that there exists a graph limit $\Gamma$ with $G_\nu \to \Gamma$. Let $W$ be a
kernel on $[0,1]$ representing $\Gamma$.

We have $G_\nu \to W$, and thus $\delta_\square(W_{G_\nu}, W) \to 0$. Moreover, by 0, Lemma
5.3] we may choose the labelling of the vertices in $G_\nu$ such that

$$\|W_{G_\nu} - W\|_\square \to 0. \tag{8.10}$$

This labelling yields an order $<$ on $V(G_\nu)$. Let $\prec$ be an order on $V(G_\nu)$
achieving the minimum in (8.5) for $\Omega_1^*$, i.e., such that

$$\Omega_1^*(G_\nu, \prec) = \Omega_1^*(G_\nu) = o(1). \tag{8.11}$$

In general $\prec$ differs from $<$, but it clearly corresponds to some order $\prec_\nu$
on $[0, 1]$ and, by (8.8) again,

$$\Omega_1^*(G_\nu, \prec) = \Omega_1^*(W_{G_\nu}, \prec_\nu) = \widetilde\Omega_1^*(W_{G_\nu}, \prec_\nu). \tag{8.12}$$

By Lemma 8.6 and (8.10)-(8.12), we then have

$$\widetilde\Omega_1^*(W, \prec_\nu) \le \widetilde\Omega_1^*(W_{G_\nu}, \prec_\nu) + 2\|W - W_{G_\nu}\|_\square \to 0,$$

as $\nu \to \infty$; hence $\widetilde\Omega_1^*(W) = 0$ and $\Gamma = \Gamma_W$ is a threshold graph limit by
Theorem 8.11.



(ii) $\Rightarrow$ (iii). Suppose that (iii) fails; then there exists $\varepsilon > 0$ and a subsequence
for which $d_e(G_\nu, \mathcal T) > \varepsilon |G_\nu|^2$. We may select a subsubsequence such
that $G_\nu$ converges; we shall show that (ii) implies (iii) in this case, which
yields a contradiction.

Suppose then that $G_\nu \to \Gamma$ for some graph limit $\Gamma$, and that (ii) holds.
By assumption, $\Gamma$ is a threshold graph limit. Let $W$ be a kernel on $[0, 1]$
representing $\Gamma$. By the result of Diaconis, Holmes and Janson [7] discussed
in the introduction, we may choose $W$ to be monotone and 0/1-valued.

We have $G_\nu \to W$, and thus $\delta_\square(W_{G_\nu}, W) \to 0$. As above, by @, Lemma
5.3] we may choose the labelling of the vertices in $G_\nu$ such that
$\|W_{G_\nu} - W\|_\square \to 0$. By Lemma 8.13 this implies $\|W_{G_\nu} - W\|_{L^1} \to 0$.

Since, by assumption, $\Gamma$ is a threshold graph limit, there exists a sequence
of threshold graphs $G'_\nu$ such that $G'_\nu \to \Gamma$, and we may further assume that
$|G'_\nu| = |G_\nu|$. (For example, we may a.s. take $G'_\nu$ as the random graph
$G(n_\nu, W)$ with $n_\nu = |G_\nu|$.) Then also $\delta_\square(W_{G'_\nu}, W) \to 0$, and by @, Lemma
5.3] again we may choose the labelling of the vertices in $G'_\nu$ such that
$\|W_{G'_\nu} - W\|_\square \to 0$, and thus by Lemma 8.13 $\|W_{G'_\nu} - W\|_{L^1} \to 0$. Consequently,

$$\|W_{G_\nu} - W_{G'_\nu}\|_{L^1} \le \|W_{G_\nu} - W\|_{L^1} + \|W - W_{G'_\nu}\|_{L^1} \to 0.$$

We may identify the vertex sets of $G_\nu$ and $G'_\nu$. Then

$$d_e(G_\nu, \mathcal T) \le |E(G_\nu) \triangle E(G'_\nu)| = \tfrac12 |G_\nu|^2 \|W_{G_\nu} - W_{G'_\nu}\|_{L^1} = o(|G_\nu|^2).$$



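(iii) $\Leftrightarrow$ (iv) by the definition (8.6).

(iv) $\Leftrightarrow$ (v) by

$$\|W_{G_\nu} - W_{G'_\nu}\|_{L^1(\mathcal S^2)} = 2|G_\nu|^{-2}\, |E(G_\nu) \triangle E(G'_\nu)|.$$

(v) $\Rightarrow$ (vi) since $\|\cdot\|_\square \le \|\cdot\|_{L^1(\mathcal S^2)}$.

(vi) $\Rightarrow$ (i). Let $<$ be the order on $V(G_\nu) = V(G'_\nu)$ defined by the
degrees of the vertices in $G'_\nu$. Then, since $G'_\nu$ is a threshold graph, $N_{G'_\nu}(v) \subseteq
N_{G'_\nu}(w) \cup \{w\}$ whenever $v < w$, and thus $\Omega_0^*(G'_\nu, <) = 0$ by (8.1).
By (8.2) and Lemma 8.6,

$$\begin{aligned}
\Omega_0^*(G_\nu, <) &= \Omega_0^*(G_\nu, <) - \Omega_0^*(G'_\nu, <) = \Omega_1^*(G_\nu, <) - \Omega_1^*(G'_\nu, <) + o(1) \\
&= \widetilde\Omega_1^*(W_{G_\nu}, <) - \widetilde\Omega_1^*(W_{G'_\nu}, <) + o(1) \\
&\le 2\|W_{G_\nu} - W_{G'_\nu}\|_\square + o(1) = o(1).
\end{aligned}$$

Hence $\Omega_0^*(G_\nu) \to 0$. □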